MODIFIED ZINC FINGER BINDING PROTEINS

TECHNICAL FIELD 
The methods and compositions disclosed herein relate generally to the field of 
regulation of gene expression and specifically to methods of modulating gene 
expression by utilizing polypeptides derived from zinc finger-nucleotide binding 
proteins. 

BACKGROUND 

Sequence-specific binding of proteins to DNA, RNA, protein and other 
molecules is involved in a number of cellular processes such as, for example, 
transcription, replication, chromatin structure, recombination, DNA repair, RNA 
processing and translation. The binding specificity of cellular binding proteins that 
participate in protein-DNA, protein-RNA and protein-protein interactions contributes 
to development, differentiation and homeostasis. Alterations in specific protein 
interactions can be involved in various types of pathologies such as, for example, 
cancer, cardiovascular disease and infection. 

Zinc finger proteins (ZFPs) are proteins that can bind to DNA in a
sequence-specific manner. Zinc fingers were first identified in the transcription
factor TFIIIA from the oocytes of the African clawed toad, Xenopus laevis. A single zinc
finger domain of this class of ZFPs is about 30 amino acids in length, and several structural
studies have demonstrated that it contains a beta turn (containing the two invariant 
cysteine residues) and an alpha helix (containing the two invariant histidine residues), 
which are held in a particular conformation through coordination of a zinc atom by 
the two cysteines and the two histidines. This class of ZFPs is also known as C2H2 
ZFPs. Additional classes of ZFPs have also been suggested. (See, e.g., Jiang et al.
(1996) J. Biol. Chem. 271:10723-10730 for a discussion of Cys-Cys-His-Cys (C3H)
ZFPs.) To date, over 10,000 zinc finger sequences have been identified in several
thousand known or putative transcription factors. Zinc finger domains are involved 
not only in DNA recognition, but also in RNA binding and in protein-protein binding. 
Current estimates are that this class of molecules will constitute about 2% of all 
human genes. 

Most zinc finger proteins have conserved cysteine and histidine residues that
tetrahedrally coordinate the single zinc atom in each finger domain. In particular,
most ZFPs are characterized by finger components of the general sequence:
-Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His (SEQ ID NO: 1), where X is any amino acid (the C2H2
ZFPs). The zinc-coordinating sequences of this most widely represented class contain
two cysteines and two histidines with particular spacings. For example, the structures
of zinc fingers found in the yeast protein ADR1, the human male associated protein ZFY,
the HIV enhancer protein and the Xenopus protein Xfin have been solved by high resolution
NMR methods (Kochoyan, et al., Biochemistry, 30:3371-3386, 1991; Omichinski, et
al., Biochemistry, 29:9324-9334, 1990; Lee, et al., Science, 245:635-637, 1989).
Based on x-ray crystallography, the three-dimensional structure of a three finger 
polypeptide-DNA complex derived from the mouse immediate early protein zif268 
(also known as Krox-24) has been solved (Pavletich and Pabo, Science, 252:809-817,
1991). The folded structure of each finger contains an antiparallel β-turn, a
finger tip region and a short amphipathic α-helix. The metal coordinating ligands
bind to the Zn ion and, in the case of zif268 zinc fingers, the short amphipathic
α-helix binds in the major groove of DNA. In addition, the conserved hydrophobic
amino acids and zinc coordination by the cysteine and histidine residues stabilize the 
structure of the individual finger domain. 

The folding of a C2H2 ZFP into the proper finger structure can be entirely 
disrupted by exchange of the C2H2 ligand amino acids. Miura et al. (1998) Biochim. 
Biophys. Acta 1384:171-179. Furthermore, metal binding specificity of peptides 
based on the C2H2 consensus sequence can be altered. Krizek et al. (1993) Inorg. 
Chem. 32:937-940; Merkle et al. (1991) J. Am. Chem. Soc. 113:5450-5451. Although
detailed models for the interaction of zinc fingers and DNA have also been proposed
(Berg, 1988; Berg, 1990; Churchill, et al., 1990), mutations in finger 2 of the
three-fingered C2H2 ZFP zif268 have been shown to entirely abolish DNA binding (Green
et al. (1998) Biochem J. 333:85-90). 

Nonetheless, increased understanding of the nature and mechanism of protein 
binding specificity has encouraged the hope that specificity of a binding protein could 
be altered in a predictable fashion, or that a binding protein of predetermined 
specificity could be constructed de novo. See, for example, Blackburn (2000) Curr. 
Opin. Struct. Biol. 10:399-400; Segal et al. (2000) Curr. Opin. Chem. Biol. 4:34-39. 
To this end, attempts have been made to modify C2H2 zinc finger proteins. See, e.g.,
U.S. Patent Nos. 6,007,988; 6,013,453; 6,140,081; PCT WO98/53057;
PCT WO98/53058; PCT WO98/53059; PCT WO98/53060; PCT WO00/23464;
PCT WO 00/42219; Choo et al. (2000) Curr. Opin. Struct. Biol. 10:411-416; Segal
et al. (2000) Curr. Opin. Chem. Biol. 4:34-39; and references cited in these
publications.

To date, however, cellular studies using designed C2H2 ZFPs have utilized 
relatively few positions in the zinc finger as adjustable parameters to obtain optimal 
activity. In particular, studies to date have modified only those residues at the
finger-DNA interface. These have included positions known to make direct base contacts,
'supporting' or 'buttressing' residues immediately adjacent to the base-contacting
positions, and positions capable of contacting the phosphate backbone of the DNA.
Furthermore, many observed effects have been quite modest, and the possibility that
improved ZFP activities might be achieved via substitution of residues at other
positions in the finger or using non-C2H2 polypeptides has remained completely
uninvestigated.

Thus, there exists a need for additional designed or selected zinc finger 
binding proteins. 

SUMMARY

Disclosed herein are binding proteins, particularly zinc finger binding proteins,
with modified metal coordination sites. Methods of making and using these proteins
are also provided. In preferred embodiments, the binding proteins contain three zinc
coordinating fingers and one or more of these fingers are modified, non-canonical
(e.g., non-C2H2) finger components. Preferably, the third finger of a three-finger
ZFP is modified and non-canonical.

In one aspect, an isolated, non-canonical zinc finger binding protein comprising
one or more non-canonical zinc finger components that bind to a target sequence is
provided. The isolated zinc finger binding protein can be provided as a nucleic acid
molecule or as a polypeptide. Furthermore, the target sequence can be an amino acid,
DNA (e.g., promoter sequence) or RNA and, additionally, may be in a prokaryotic (e.g.,
bacteria) or eukaryotic cell (e.g., plant cell, yeast cell, fungal cell, animal cell such
as a human cell). In certain embodiments, the amino acid sequence of one or more of the
zinc finger components is
X3-B-X2-4-Cys-X12-His-X1-7-His-X4;
X3-Cys-X2-4-B-X12-His-X1-7-His-X4;
X3-Cys-X2-4-Cys-X12-Z-X1-7-His-X4;
X3-Cys-X2-4-Cys-X12-His-X1-7-Z-X4;
X3-B-X2-4-B-X12-His-X1-7-His-X4;
X3-B-X2-4-Cys-X12-Z-X1-7-His-X4;
X3-B-X2-4-Cys-X12-His-X1-7-Z-X4;
X3-Cys-X2-4-B-X12-Z-X1-7-His-X4;
X3-Cys-X2-4-B-X12-His-X1-7-Z-X4;
X3-Cys-X2-4-Cys-X12-Z-X1-7-Z-X4;
X3-Cys-X2-4-B-X12-Z-X1-7-Z-X4;
X3-B-X2-4-Cys-X12-Z-X1-7-Z-X4;
X3-B-X2-4-B-X12-His-X1-7-Z-X4;
X3-B-X2-4-B-X12-Z-X1-7-His-X4; and
X3-B-X2-4-B-X12-Z-X1-7-Z-X4,
wherein X is any amino acid, B is any amino acid except cysteine and Z is any
amino acid except histidine.

The modified zinc finger proteins described herein can include any number of zinc
coordinating finger components in which one or more of the zinc finger coordinates are
non-canonical. In preferred embodiments, the ZFP comprises three fingers, wherein one
or more of the finger components is non-canonical. In certain embodiments, the third zinc
finger component is non-canonical. In other embodiments, any of the ZFPs described
herein comprise a modified plant ZFP backbone.

In other aspects, fusion polypeptides comprising (a) any of the zinc finger binding
proteins described herein and (b) at least one functional domain are provided. The
functional domain may be, for example, a repressive domain such as KRAB, MBD-2B,
v-ErbA, MBD3, TR, and members of the DNMT family; an activation domain such as
VP16, p65 subunit of NF-kappa B, and VP64; an insulator domain; a chromatin
remodeling protein; and/or a methyl binding domain.

In other aspects, polynucleotides encoding any of the zinc finger proteins (or fusion
molecules) described herein are provided. Expression vectors and host cells comprising
these polynucleotides are also provided.

In yet other aspects, a method of modulating expression of a gene is provided. The
method comprises the step of contacting a region of DNA with any of the zinc finger
containing fusion molecules described herein. In certain embodiments, the zinc finger
binding protein of the fusion molecule binds to a target site in a gene encoding a product
selected from the group consisting of vascular endothelial growth factor, erythropoietin,
androgen receptor, PPAR-γ2, p16, p53, pRb, dystrophin and E-cadherin, delta-9
desaturase, delta-12 desaturases from other plants, delta-15 desaturase, acetyl-CoA
carboxylase, acyl-ACP-thioesterase, ADP-glucose pyrophosphorylase, starch synthase,
cellulose synthase, sucrose synthase, senescence-associated genes, heavy metal chelators,
fatty acid hydroperoxide lyase, polygalacturonase, EPSP synthase, plant viral genes, plant
fungal pathogen genes, and plant bacterial pathogen genes. (See, also WO 00/41566).
The gene may be in any cell, for example a plant cell or animal (e.g., human) cell.

In still further aspects, compositions comprising any of the zinc finger proteins (or
fusion molecules) described herein and a pharmaceutically acceptable excipient are
provided.

These and other embodiments will readily occur to those of skill in the art in 
light of the disclosure herein. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a graph depicting levels of LCK gene mRNA (normalized to 18S
rRNA levels) in cells transfected with constructs encoding fusions of the VP16
activation domain with a canonical ZFP (PTP2), a modified ZFP (PTP2(H→C)), and a
control construct (NVF).

Figure 2 shows VEGF-A levels in the culture medium of cells that had been
transfected with plasmids encoding non-canonical ZFP fusion proteins comprising a
VP16 activation domain, that were targeted to the VEGF gene. Mock indicates
untransfected cells; empty vector indicates transfection with a DNA construct lacking
sequences encoding a fusion protein; and C2H2 indicates cells transfected with
plasmids encoding the canonical C2H2 VOP30A and VOP32B ZFP-VP16 fusion
proteins. S, E, K, CT, C, GC and GGC indicate non-canonical derivatives of
VOP30A and VOP32B containing a C2HC zinc finger, as described in Table 1. The
left-hand bar of each pair shows results for VOP30A and its non-canonical
derivatives; the right-hand bar of each pair shows results for VOP32B and its
non-canonical derivatives. The C derivative of VOP32B and the GC derivative of
VOP30A were not tested. Results are the average of two determinations.

Figure 3, panels A and B, are schematics depicting construction of the YCF3 
expression vector useful in expressing modified ZFPs. 

Figure 4 shows the results of analysis of GMT mRNA in RNA isolated from
Arabidopsis thaliana protoplasts that had been transfected with constructs encoding
fusion of a transcriptional activation domain with various modified plant ZFPs.
Results are expressed as GMT mRNA normalized to 18S rRNA. AGMT numbers on
the abscissa refer to the modified plant ZFP binding domains shown in Table 2.
Duplicate TaqMan® analyses are shown for each RNA sample.

DETAILED DESCRIPTION

General 

The present disclosure provides isolated, non-canonical zinc finger binding
polypeptides (ZFPs), wherein one or more of the zinc finger components differs from
the canonical consensus sequence of Cys-Cys-His-His (e.g., Cys2-His2). The
polypeptide can be a fusion polypeptide and, either by itself or as part of such a
fusion, can enhance or suppress transcription of a gene, and may bind to DNA, RNA
and/or protein. Polynucleotides encoding non-canonical ZFPs and fusion proteins
comprising one or more non-canonical ZFPs are also provided. Additionally provided
are pharmaceutical compositions comprising a therapeutically effective amount of any
of the modified zinc finger-nucleotide binding polypeptides described herein or
functional fragments thereof; or a therapeutically effective amount of a nucleotide
sequence that encodes any of the modified zinc finger-nucleotide binding
polypeptides or functional fragments thereof, wherein the zinc finger polypeptide or
functional fragment thereof binds to a cellular nucleotide sequence to modulate the
function of the cellular nucleotide sequence, in combination with a pharmaceutically
acceptable carrier. Also provided are screening methods for obtaining a modified
zinc finger-nucleotide binding polypeptide which binds to a cellular or viral
nucleotide sequence.

Currently, designed and/or selected ZFPs utilize relatively few positions in the
zinc finger as adjustable parameters to obtain optimal activity. In particular, studies
to date have altered only those residues at the finger-DNA interface. See, e.g., U.S.
Patent Nos. 6,007,988; 6,013,453; 6,140,081 and 6,140,466, as well as PCT WO
00/42219. As noted above, the observed effects have been quite modest, and the
possibility that improved ZFP activities might be accessible via substitution of
residues at other positions in the finger has not been investigated.

Accordingly, in one embodiment, modified (e.g., non-canonical) zinc finger
proteins are described in which the sequence of one or more zinc fingers of the ZFP
differs from the canonical consensus sequence containing two cysteine (Cys) residues
and two histidine (His) residues:

X3-Cys-X2-4-Cys-X12-His-X1-7-His-X4 (SEQ ID NO: 2)

(also known as the "Cys2-His2" or "C2H2" consensus sequence). As zinc
coordination provides the principal folding energy for zinc fingers, adjustment of zinc
coordinating residues would appear to provide a ready means for modifying finger



WO 02/057293 



PCT/US02/01893 



stability and structure, which could impact on a variety of important functional
features of zinc finger protein transcription factors. In particular, features such as
cellular half-life, interactions with other cellular factors, DNA binding specificity and
affinity, and relative orientation of functional domains would all be expected to be
influenced by residue choice at the zinc-coordinating positions.

Thus, in preferred embodiments, one or more zinc coordinating fingers
making up the zinc finger protein has any of the following sequences:

X3-B-X2-4-Cys-X12-His-X1-7-His-X4
X3-Cys-X2-4-B-X12-His-X1-7-His-X4
X3-Cys-X2-4-Cys-X12-Z-X1-7-His-X4
X3-Cys-X2-4-Cys-X12-His-X1-7-Z-X4
X3-B-X2-4-B-X12-His-X1-7-His-X4
X3-B-X2-4-Cys-X12-Z-X1-7-His-X4
X3-B-X2-4-Cys-X12-His-X1-7-Z-X4
X3-Cys-X2-4-B-X12-Z-X1-7-His-X4
X3-Cys-X2-4-B-X12-His-X1-7-Z-X4
X3-Cys-X2-4-Cys-X12-Z-X1-7-Z-X4
X3-Cys-X2-4-B-X12-Z-X1-7-Z-X4
X3-B-X2-4-Cys-X12-Z-X1-7-Z-X4
X3-B-X2-4-B-X12-His-X1-7-Z-X4
X3-B-X2-4-B-X12-Z-X1-7-His-X4
X3-B-X2-4-B-X12-Z-X1-7-Z-X4

where X= any amino acid
B= any amino acid except cysteine
Z= any amino acid except histidine
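As a cross-check on the list above, the fifteen sequences correspond to every way of
replacing at least one of the four zinc-coordinating residues (Cys, Cys, His, His) with
B or Z while leaving the spacers unchanged. The following Python sketch is illustrative
only and is not part of the disclosure; the function name and the plain-text spacer
notation are assumptions made for this example.

```python
from itertools import product

# Spacers of the canonical consensus X3-Cys-X2-4-Cys-X12-His-X1-7-His-X4.
SPACERS = ["X3", "X2-4", "X12", "X1-7", "X4"]

# Canonical residue and its substituted symbol at each coordinating position:
# B = any amino acid except cysteine, Z = any amino acid except histidine.
POSITIONS = [("Cys", "B"), ("Cys", "B"), ("His", "Z"), ("His", "Z")]

CANONICAL = ("Cys", "Cys", "His", "His")


def non_canonical_patterns():
    """Return every pattern with at least one coordinating residue substituted."""
    patterns = []
    for choice in product(*POSITIONS):
        if choice == CANONICAL:
            continue  # skip the canonical C2H2 pattern itself
        parts = [SPACERS[0]]
        for residue, spacer in zip(choice, SPACERS[1:]):
            parts += [residue, spacer]
        patterns.append("-".join(parts))
    return patterns


if __name__ == "__main__":
    for pattern in non_canonical_patterns():
        print(pattern)
```

Two choices at each of four positions give sixteen combinations; excluding the
canonical one leaves the fifteen non-canonical patterns listed above (in a different
order).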

Additionally, it is preferred that a zinc finger protein comprises at least three zinc
coordinating fingers and that at least one of these fingers is non-canonical. In the standard
nomenclature for ZFPs, the "first" finger is the N-terminal-most finger of the protein (with
respect to the other fingers) and binds to the 3'-most triplet (or quadruplet) subsite in the
target site. Additional fingers, moving towards the C-terminus of the protein, are
numbered sequentially. For example, in certain embodiments, a three-finger zinc finger
protein is provided wherein the first two fingers are of the C2H2 class but the first or
second histidine residue in the third finger (and optionally adjacent amino acid residues) is
substituted with Cys or with Cys and additional amino acids, such as glycine. In other
embodiments, a three-finger zinc finger protein is provided wherein the first or second
cysteine residue in the first finger is substituted with histidine or with histidine and
additional amino acids such as glycine. Furthermore, in certain embodiments, a finger of a
zinc finger protein is modified such that, in one or more of the fingers, one or more
cysteine or histidine residues is replaced with a different amino acid such as, for example,
serine. In one embodiment, the second finger of a three-finger zinc finger protein is
modified such that one or both of the cysteine residues are replaced with serine (and/or
additional amino acids). Additionally, carboxyl-containing amino acids, such as, for
example, aspartic acid and glutamic acid, can be substituted for cysteine and/or histidine
in a zinc finger. Furthermore, ZFPs comprising two or more fingers in which more than one
finger is modified are also provided.

Therefore, the ZFPs disclosed herein differ from previously described designed
zinc finger protein transcription factors in that they comprise at least one
zinc-coordinating finger that differs from the canonical consensus sequence (Cys-Cys-His-His).
It will be readily apparent that various combinations of modified zinc fingers can be used
in a single protein; for example, all of the finger components may be modified using the
same or different modified zinc fingers. Alternatively, less than all of the fingers can be
modified using the same or different modified fingers. Furthermore, the non-canonical
modified finger components described herein can also be used in combination with
previously described C2H2 ZFP finger components.

In additional embodiments, the isolated non-canonical zinc fingers described herein
are used in fusion proteins, for example fusions of a ZFP DNA-binding domain with
repression or activation domains or with chromatin remodeling domains. Polynucleotides
encoding any of the zinc finger proteins, components thereof and fusions thereof are also
provided.

The practice of the disclosed methods employs, unless otherwise indicated,
conventional techniques in molecular biology, biochemistry, genetics, computational
chemistry, cell culture, recombinant DNA and related fields as are within the skill of the
art. These techniques are fully explained in the literature. See, for example, Sambrook et
al., MOLECULAR CLONING: A LABORATORY MANUAL, Second edition, Cold Spring Harbor
Laboratory Press, 1989; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY,
John Wiley & Sons, New York, 1987 and periodic updates; and the series METHODS IN
ENZYMOLOGY, Academic Press, San Diego.

Definitions

The terms "nucleic acid," "polynucleotide," and "oligonucleotide" are used
interchangeably and refer to a deoxyribonucleotide or ribonucleotide polymer in either
single- or double-stranded form. For the purposes of the present disclosure, these terms
are not to be construed as limiting with respect to the length of a polymer. The terms can
encompass known analogues of natural nucleotides, as well as nucleotides that are
modified in the base, sugar and/or phosphate moieties. In general, an analogue of a
particular nucleotide has the same base-pairing specificity; i.e., an analogue of A will
base-pair with T.

The terms "polypeptide," "peptide" and "protein" are used interchangeably to refer
to a polymer of amino acid residues. The terms also apply to amino acid polymers in
which one or more amino acids are chemical analogues or modified derivatives of a
corresponding naturally occurring amino acid, for example selenocysteine (Bock et al.
(1991) Trends Biochem. Sci. 16:463-467; Nasim et al. (2000) J. Biol. Chem. 275:14,846-
14,852) and the like.

A "binding protein" is a protein that is able to bind non-covalently to another
molecule. A binding protein can bind to, for example, a DNA molecule (a DNA-binding
protein), an RNA molecule (an RNA-binding protein) and/or a protein molecule (a
protein-binding protein). In the case of a protein-binding protein, it can bind to itself (to
form homodimers, homotrimers, etc.) and/or it can bind to one or more molecules of a
different protein or proteins. A binding protein can have more than one type of binding
activity. For example, zinc finger proteins have DNA-binding, RNA-binding and
protein-binding activity. A "binding profile" refers to a plurality of target sequences that
are recognized and bound by a particular binding protein. For example, a binding profile can
be determined by contacting a binding protein with a population of randomized target
sequences to identify a sub-population of target sequences bound by that particular binding
protein.

A "zinc finger binding protein" is a protein or segment within a larger protein that
binds DNA, RNA and/or protein in a sequence-specific manner as a result of stabilization
of protein structure through coordination of a zinc ion. The term zinc finger binding
protein is often abbreviated as zinc finger protein or ZFP. A "canonical" zinc finger refers
to a zinc-coordinating component (e.g., zinc finger) of a zinc finger protein having the
general amino acid sequence: X3-Cys-X2-4-Cys-X12-His-X1-7-His-X4, where X is any amino
acid (also known as a C2H2 zinc finger).
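The canonical consensus can be written as a regular expression over one-letter amino
acid codes. The sketch below is a hedged illustration rather than a method taken from
this disclosure: the function name is hypothetical, and the test sequences are
artificial poly-alanine constructs chosen only to satisfy or violate the spacing rules.

```python
import re

# X3-Cys-X2-4-Cys-X12-His-X1-7-His-X4, where X (".") is any amino acid.
CANONICAL_C2H2 = re.compile(r".{3}C.{2,4}C.{12}H.{1,7}H.{4}")


def is_canonical_finger(finger: str) -> bool:
    """Return True if an entire single-finger sequence fits the C2H2 consensus."""
    return CANONICAL_C2H2.fullmatch(finger.upper()) is not None


# Artificial example sequences (hypothetical, not taken from any real protein).
toy_c2h2 = "AAA" + "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H" + "AAAA"
# Same spacing, but the second histidine is replaced by cysteine (a C2HC finger).
toy_c2hc = "AAA" + "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "C" + "AAAA"

if __name__ == "__main__":
    print(is_canonical_finger(toy_c2h2))
    print(is_canonical_finger(toy_c2hc))
```

A finger that fails this check at a coordinating position, such as the C2HC example,
is non-canonical in the sense used in this disclosure. The pattern checks spacing and
coordinating residues only; it says nothing about folding or binding.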

A "modified" zinc finger protein is a protein not occurring in nature that has been
designed and/or selected so as to comprise a substitution of at least one amino acid,
compared to a naturally occurring zinc finger protein. Further, a "designed" zinc finger
protein is a protein not occurring in nature whose structure and composition results
principally from rational criteria. Rational criteria for design include application of
substitution rules and computerized algorithms for processing information in a database
storing information of existing ZFP designs and binding data, for example as described in
co-owned PCT WO 00/42219. A "selected" zinc finger protein is a protein not found in
nature whose production results primarily from an empirical process such as phage
display. See, e.g., U.S. 5,789,538; U.S. 6,007,988; U.S. 6,013,453; WO 95/19431;
WO 96/06166 and WO 98/54311. Designed and/or selected ZFPs are also referred to as
"engineered" ZFPs and can be modified according to the methods and compositions
disclosed herein (e.g., by conversion to C3H and/or to comprise a plant backbone).

The term "naturally-occurring" is used to describe an object that can be found in
nature, as distinct from being artificially produced by a human.

A zinc finger "backbone" is the portion of a zinc finger outside the region involved
in DNA major groove interactions; the regions of the zinc finger outside of residues -1
through +6 of the alpha helix. The backbone comprises the beta strands, the connecting
region between the second beta strand and the alpha helix, the portion of the alpha helix
distal to the first conserved histidine residue, and the inter-finger linker sequence(s).

Nucleic acid or amino acid sequences are "operably linked" (or "operatively
linked") when placed into a functional relationship with one another. For instance, a
promoter or enhancer is operably linked to a coding sequence if it regulates, or contributes
to the modulation of, the transcription of the coding sequence. Operably linked DNA
sequences are typically contiguous, and operably linked amino acid sequences are
typically contiguous and in the same reading frame. However, since enhancers generally
function when separated from the promoter by up to several kilobases or more and intronic
sequences may be of variable lengths, some polynucleotide elements may be operably
linked but not contiguous. Similarly, certain amino acid sequences that are
non-contiguous in a primary polypeptide sequence may nonetheless be operably linked due to,
for example, folding of a polypeptide chain.

With respect to fusion polypeptides, the term "operatively linked" can refer to the
fact that each of the components performs the same function in linkage to the other
component as it would if it were not so linked. For example, with respect to a fusion
polypeptide in which a ZFP DNA-binding domain is fused to a transcriptional activation
domain (or functional fragment thereof), the ZFP DNA-binding domain and the
transcriptional activation domain (or functional fragment thereof) are in operative linkage
if, in the fusion polypeptide, the ZFP DNA-binding domain portion is able to bind its
target site and/or its binding site, while the transcriptional activation domain (or functional
fragment thereof) is able to activate transcription.

"Specific binding" between, for example, a ZFP and a specific target site means a
binding affinity of at least 1 x 10^6 M^-1.
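For orientation, an affinity expressed in units of M^-1 is an association constant,
and for a simple one-to-one binding equilibrium the dissociation constant is its
reciprocal, so the threshold above corresponds to a dissociation constant of at most
about 1 micromolar. The Python sketch below illustrates the conversion; the function
and constant names are hypothetical, and the one-to-one binding model is an assumption
of this example rather than a statement from the disclosure.

```python
# Threshold for "specific binding" as an association constant, in M^-1.
SPECIFIC_BINDING_KA = 1e6


def dissociation_constant(ka_per_molar: float) -> float:
    """Convert an association constant (M^-1) to a dissociation constant (M).

    Assumes simple 1:1 binding, for which Kd = 1 / Ka.
    """
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / ka_per_molar


def is_specific(ka_per_molar: float) -> bool:
    """True if the affinity meets the stated threshold for specific binding."""
    return ka_per_molar >= SPECIFIC_BINDING_KA


if __name__ == "__main__":
    print(dissociation_constant(SPECIFIC_BINDING_KA))  # Kd at the threshold, in M
    print(is_specific(1e9), is_specific(1e5))
```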

A "fusion molecule" is a molecule in which two or more subunit molecules are
linked, preferably covalently. The subunit molecules can be the same chemical type of
molecule, or can be different chemical types of molecules. Examples of the first type of
fusion molecule include, but are not limited to, fusion polypeptides (for example, a fusion
between a ZFP DNA-binding domain and a transcriptional activation domain) and fusion
nucleic acids (for example, a nucleic acid encoding the fusion polypeptide described
herein). Examples of the second type of fusion molecule include, but are not limited to, a
fusion between a triplex-forming nucleic acid and a polypeptide, and a fusion between a
minor groove binder and a nucleic acid.

A "gene," for the purposes of the present disclosure, includes a DNA region
encoding a gene product (see below), as well as all DNA regions that regulate the
production of the gene product, whether or not such regulatory sequences are adjacent to
coding and/or transcribed sequences. Accordingly, a gene includes, but is not necessarily
limited to, promoter sequences, terminators, translational regulatory sequences such as
ribosome binding sites and internal ribosome entry sites, enhancers, silencers, insulators,
boundary elements, replication origins, matrix attachment sites and locus control regions.
Further, a promoter can be a normal cellular promoter or, for example, a promoter of an
infecting microorganism such as, for example, a bacterium or a virus. For example, the
long terminal repeat (LTR) of retroviruses is a promoter region that may be a target for a
modified zinc finger binding polypeptide. Promoters from members of the Lentivirus
group, which include such pathogens as human T-cell lymphotropic virus (HTLV) 1 and
2, or human immunodeficiency virus (HIV) 1 or 2, are examples of viral promoter regions
which may be targeted for transcriptional modulation by a modified zinc finger binding
polypeptide as described herein.

"Gene expression" refers to the conversion of the information, contained in a gene,
into a gene product. A gene product can be the direct transcriptional product of a gene
(e.g., mRNA, tRNA, rRNA, antisense RNA, ribozyme, structural RNA or any other type
of RNA) or a protein produced by translation of an mRNA. Gene products also include
RNAs that are modified, by processes such as capping, polyadenylation, methylation, and
editing, and proteins modified by, for example, methylation, acetylation, phosphorylation,
ubiquitination, ADP-ribosylation, myristoylation, and glycosylation.

"Gene activation" and "augmentation of gene expression" refer to any process that 
results in an increase in production of a gene product. A gene product can be either RNA
(including, but not limited to, mRNA, rRNA, tRNA, and structural RNA) or protein.
Accordingly, gene activation includes those processes that increase transcription of a gene
and/or translation of an mRNA. Examples of gene activation processes which increase
transcription include, but are not limited to, those which facilitate formation of a
transcription initiation complex, those which increase transcription initiation rate, those
which increase transcription elongation rate, those which increase processivity of
transcription and those which relieve transcriptional repression (by, for example, blocking
the binding of a transcriptional repressor). Gene activation can constitute, for example,
inhibition of repression as well as stimulation of expression above an existing level.
Examples of gene activation processes that increase translation include those that increase
translational initiation, those that increase translational elongation and those that increase
mRNA stability. In general, gene activation comprises any detectable increase in the
production of a gene product, preferably an increase in production of a gene product by
about 2-fold, more preferably from about 2- to about 5-fold or any integral value
therebetween, more preferably between about 5- and about 10-fold or any integral value
therebetween, more preferably between about 10- and about 20-fold or any integral value
therebetween, still more preferably between about 20- and about 50-fold or any integral
value therebetween, more preferably between about 50- and about 100-fold or any integral
value therebetween, more preferably 100-fold or more.

"Gene repression" and "inhibition of gene expression" refer to any process that
results in a decrease in production of a gene product. A gene product can be either RNA
(including, but not limited to, mRNA, rRNA, tRNA, and structural RNA) or protein.
Accordingly, gene repression includes those processes that decrease transcription of a gene
and/or translation of an mRNA. Examples of gene repression processes which decrease
transcription include, but are not limited to, those which inhibit formation of a
transcription initiation complex, those which decrease transcription initiation rate, those
which decrease transcription elongation rate, those which decrease processivity of
transcription and those which antagonize transcriptional activation (by, for example,
blocking the binding of a transcriptional activator). Gene repression can constitute, for
example, prevention of activation as well as inhibition of expression below an existing
level. Examples of gene repression processes that decrease translation include those that
decrease translational initiation, those that decrease translational elongation and those that
decrease mRNA stability. Transcriptional repression includes both reversible and
irreversible inactivation of gene transcription. In general, gene repression comprises any
detectable decrease in the production of a gene product, preferably a decrease in
production of a gene product by about 2-fold, more preferably from about 2- to about
5-fold or any integral value therebetween, more preferably between about 5- and about
10-fold or any integral value therebetween, more preferably between about 10- and about
20-fold or any integral value therebetween, still more preferably between about 20- and about
50-fold or any integral value therebetween, more preferably between about 50- and about
100-fold or any integral value therebetween, more preferably 100-fold or more. Most
preferably, gene repression results in complete inhibition of gene expression, such that no
gene product is detectable.

The term "modulate" refers to a change in the quantity, degree or extent of a
function. For example, the modified zinc finger-nucleotide binding polypeptides disclosed
herein may modulate the activity of a promoter sequence by binding to a motif within the
promoter, thereby inducing, enhancing or suppressing transcription of a gene operatively
linked to the promoter sequence. Alternatively, modulation may include inhibition of
transcription of a gene wherein the modified zinc finger-nucleotide binding polypeptide
binds to the structural gene and blocks DNA-dependent RNA polymerase from reading
through the gene, thus inhibiting transcription of the gene. The structural gene may be a
normal cellular gene or an oncogene, for example. Alternatively, modulation may include
inhibition of translation of a transcript. Thus, "modulation" of gene expression includes
both gene activation and gene repression.

Modulation can be assayed by determining any parameter that is indirectly or
directly affected by the expression of the target gene. Such parameters include, e.g.,
changes in RNA or protein levels; changes in protein activity; changes in product levels;
changes in downstream gene expression; changes in transcription or activity of reporter
genes such as, for example, luciferase, CAT, beta-galactosidase, or GFP (see, e.g., Misteli
& Spector, (1997) Nature Biotechnology 15:961-964); changes in signal transduction;
changes in phosphorylation and dephosphorylation; changes in receptor-ligand
interactions; changes in concentrations of second messengers such as, for example, cGMP,
cAMP, IP3, and Ca2+; changes in cell growth; changes in neovascularization; and/or
changes in any functional effect of gene expression. Measurements can be made in vitro,
in vivo, and/or ex vivo. Such functional effects can be measured by conventional methods,
e.g., measurement of RNA or protein levels, measurement of RNA stability, and/or
identification of downstream or reporter gene expression. Readout can be by way of, for
example, chemiluminescence, fluorescence, colorimetric reactions, antibody binding,
inducible markers, ligand binding assays; changes in intracellular second messengers such
as cGMP and inositol triphosphate (IP3); changes in intracellular calcium levels; cytokine
release, and the like.

WO 02/057293 PCT/US02/01893

"Eukaryotic cells" include, but are not limited to, fungal cells (such as yeast), plant
cells, animal cells, mammalian cells and human cells. Similarly, "prokaryotic cells"
include, but are not limited to, bacteria.

A "regulatory domain" or "functional domain" refers to a protein or a polypeptide
sequence that has transcriptional modulation activity, or that is capable of interacting with
proteins and/or protein domains that have transcriptional modulation activity. Typically, a
functional domain is covalently or non-covalently linked to a ZFP to modulate
transcription of a gene of interest. Alternatively, a ZFP can act, in the absence of a
functional domain, to modulate transcription. Furthermore, transcription of a gene of
interest can be modulated by a ZFP linked to multiple functional domains.

A "functional fragment" of a protein, polypeptide or nucleic acid is a protein,
polypeptide or nucleic acid whose sequence is not identical to the full-length protein,
polypeptide or nucleic acid, yet retains the same function as the full-length protein,
polypeptide or nucleic acid. A functional fragment can possess more, fewer, or the same
number of residues as the corresponding native molecule, and/or can contain one or more
amino acid or nucleotide substitutions. Methods for determining the function of a nucleic
acid (e.g., coding function, ability to hybridize to another nucleic acid) are well known in
the art. Similarly, methods for determining protein function are well known. For example,
the DNA-binding function of a polypeptide can be determined by filter-binding,
electrophoretic mobility-shift, or immunoprecipitation assays. See Ausubel et al.,
supra. The ability of a protein to interact with another protein can be determined, for
example, by co-immunoprecipitation, two-hybrid assays or complementation, both genetic
and biochemical. See, for example, Fields et al. (1989) Nature 340:245-246; U.S. Patent
No. 5,585,245 and PCT WO 98/44350.

A "target site" or "target sequence" is a sequence that is bound by a binding protein
such as, for example, a ZFP. Target sequences can be nucleotide sequences (either DNA
or RNA) or amino acid sequences. By way of example, a DNA target sequence for a
three-finger ZFP is generally either 9 or 10 nucleotides in length, depending upon the
presence and/or nature of cross-strand interactions between the ZFP and the target
sequence. Target sequences can be found in any DNA or RNA sequence, including
regulatory sequences, exons, introns, or any non-coding sequence.

A "target subsite" or "subsite" is the portion of a DNA target site that is bound by a
single zinc finger, excluding cross-strand interactions. Thus, in the absence of cross-strand
interactions, a subsite is generally three nucleotides in length. In cases in which a
cross-strand interaction occurs (e.g., a "D-able subsite," as described for example in co-owned
PCT WO 00/42219) a subsite is four nucleotides in length and overlaps with another 3- or
4-nucleotide subsite.

The term "effective amount" includes that amount which produces the desired
result, for example, deactivation of a previously activated gene, activation of a previously
repressed gene, or inhibition of transcription of a structural gene or translation of RNA.

Zinc Finger Proteins

Zinc finger proteins are formed from zinc finger components. For example, zinc 
finger proteins can have one to thirty-seven fingers, commonly having 2, 3, 4, 5 or 6 
fingers. Zinc finger DNA-binding proteins are described, for example, in Miller et al.
(1985) EMBO J. 4:1609-1614; Rhodes et al. (1993) Scientific American Feb.:56-65; and 
Klug (1999) J. Mol. Biol. 293:215-218. A zinc finger protein recognizes and binds to a 
target site (sometimes referred to as a target segment) that represents a relatively small 
subsequence within a target gene. Each component finger of a zinc finger protein binds to 
a subsite within the target site. The subsite includes a triplet of three contiguous bases on 
the same strand (sometimes referred to as the target strand). The three bases in the subsite 
can be individually denoted the 5' base, the mid base, and the 3' base of the triplet, 
respectively. The subsite may or may not also include a fourth base on the non-target 
strand that is the complement of the base immediately 3' of the three contiguous bases on 
the target strand. The base immediately 3' of the three contiguous bases on the target
strand is sometimes referred to as the 3' of the 3' base. Alternatively, the four bases of the 
target strand in a four base subsite can be numbered 4, 3, 2, and 1, respectively, starting 
from the 5' base. 

In discussing the specificity-determining regions of a zinc finger, amino acid +1
refers to the first amino acid in the α-helical portion of the zinc finger. The portion of a
zinc finger that is generally believed to be responsible for its binding specificity lies
between -1 and +6. Amino acid ++2 refers to the amino acid at position +2 in a second
zinc finger adjacent (in the C-terminal direction) to the zinc finger under consideration. In
certain circumstances, a zinc finger binds to its triplet subsite substantially independently
of other fingers in the same zinc finger protein. Accordingly, the binding specificity of a
zinc finger protein containing multiple fingers is, to a first approximation, the aggregate of
the specificities of its component fingers. For example, if a zinc finger protein is formed
from first, second and third fingers that individually bind to triplets XXX, YYY, and ZZZ,
the binding specificity of the zinc finger protein is 3'-XXX YYY ZZZ-5'.

The relative order of fingers in a zinc finger protein, from N-terminal to
C-terminal, determines the relative order of triplets in the target sequence, in the 3' to 5'
direction, that will be recognized by the fingers. For example, if a zinc finger protein
comprises, from N-terminal to C-terminal, first, second and third fingers that individually
bind to the triplets 5'-GAC-3', 5'-GTA-3' and 5'-GGC-3', respectively, then the zinc
finger protein binds to the target sequence 5'-GGCGTAGAC-3' (SEQ ID NO: 3). If the
zinc finger protein comprises the fingers in another order, for example, second finger, first
finger, third finger, then the zinc finger protein binds to a target segment comprising a
different permutation of triplets, in this example, 5'-GGCGACGTA-3' (SEQ ID NO: 4).
See Berg et al. (1996) Science 271:1081-1086.
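
The ordering rule above can be illustrated with a short sketch. The function name target_site is hypothetical and not part of this disclosure; the sketch assumes only that each triplet is written 5' to 3' and that fingers are listed from N-terminal to C-terminal, so that the N-terminal finger contacts the 3'-most triplet.

```python
def target_site(finger_triplets):
    """Return the 5'->3' target site for fingers listed N- to C-terminal.

    Each triplet is written 5'->3'. Because the N-terminal finger binds the
    3'-most triplet, the triplets are concatenated in reverse finger order.
    """
    return "".join(reversed(finger_triplets))


# Fingers 1, 2, 3 (N- to C-terminal) binding GAC, GTA and GGC.
print(target_site(["GAC", "GTA", "GGC"]))  # GGCGTAGAC

# Same fingers rearranged as second, first, third.
print(target_site(["GTA", "GAC", "GGC"]))  # GGCGACGTA
```

The two printed sequences correspond to the two target sequences given in the example above.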

A component finger of a zinc finger protein typically contains approximately 30
amino acids and comprises the following canonical consensus sequence (from N to C):
Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His (SEQ ID NO: 2)
Thus, most C2H2 type zinc fingers contain two invariant cysteine residues in
the beta turn and two invariant histidine residues, these four residues being
coordinated through a zinc atom to maintain the characteristic zinc finger structure.
See, e.g., Berg & Shi (1996) Science 271:1081-1085. The numbering convention
used above is standard in the field for the region of a zinc finger conferring binding
specificity. The amino acid on the N-terminal side of the first invariant His residue is
assigned the number +6, and other amino acids, proceeding in an N-terminal
direction, are assigned successively decreasing numbers. The alpha helix begins at
residue +1 and extends to the residue following the second conserved histidine. The
entire helix is therefore of variable length, between 11 and 13 residues.
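
As a rough illustration of the canonical consensus, the spacing Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His can be written as a regular expression over one-letter amino acid codes. This is only a sketch: the name find_c2h2 and the example peptide strings are assumptions introduced here for illustration, and a spacing match says nothing about folding or DNA binding.

```python
import re

# C-(X)2-4-C-(X)12-H-(X)3-5-H in one-letter amino acid code.
C2H2_CONSENSUS = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2(sequence):
    """Return every non-overlapping stretch matching the consensus spacing."""
    return [m.group(0) for m in C2H2_CONSENSUS.finditer(sequence)]


# Illustrative peptide with canonical spacing (assumed example, not a SEQ ID).
canonical = "PYACPVESCDRRFSRSDELTRHIRIHTGQKP"
print(find_c2h2(canonical))  # ['CPVESCDRRFSRSDELTRHIRIH']

# Same peptide with the second His replaced by Cys: no canonical match.
modified = "PYACPVESCDRRFSRSDELTRHIRICTGQKP"
print(find_c2h2(modified))  # []
```

A finger whose coordinating residues have been substituted, as in the second string, falls outside this pattern even though the spacing between the remaining residues is unchanged.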

Certain DNA-binding domains are capable of binding to DNA that is packaged in
nucleosomes. See, for example, Cordingley et al. (1987) Cell 48:261-270; Pina et al.
(1990) Cell 60:719-731; and Cirillo et al. (1998) EMBO J. 17:244-254. Certain
ZFP-containing proteins such as, for example, members of the nuclear hormone receptor
superfamily, are capable of binding DNA sequences packaged into chromatin. These
include, but are not limited to, the glucocorticoid receptor and the thyroid hormone
receptor. Archer et al. (1992) Science 255:1573-1576; Wong et al. (1997) EMBO J.
16:7130-7145. Other DNA-binding domains, including certain ZFP-containing binding
domains, require more accessible DNA for binding. In the latter case, the required binding
specificity of the DNA-binding domain can be determined by identifying accessible
regions in the cellular chromatin. Accessible regions can be determined as described in
co-owned International Publications WO 01/83751 and WO 01/83732. A modified ZFP
DNA-binding domain is designed and/or selected to bind to a target site within the
accessible region.

A. Non-Canonical ZFPs

The compositions and methods disclosed herein include modified, preferably
non-canonical (e.g., non-C2H2), zinc finger proteins that specifically bind to a target sequence.
Non-canonical ZFP DNA-binding domains can be designed and/or selected to recognize a
particular target site, for example as described in co-owned WO 00/42219; WO 00/41566;
as well as U.S. Patents 5,789,538; 6,007,408; 6,013,453; 6,140,081 and 6,140,466; and
PCT publications WO 95/19431, WO 98/54311, WO 00/23464 and WO 00/27878. In
preferred embodiments, the process of designing or selecting a non-canonical,
non-naturally occurring ZFP typically starts with a natural ZFP as a source of framework
residues, as described in co-owned PCT WO 00/42219; WO 98/53057; WO 98/53058;
WO 98/53059 and WO 98/53060.

Briefly, the methods disclosed herein serve to modify the typically invariant Cys
and His residues while maintaining (or enhancing) the desired binding specificity of a
ZFP. The process of obtaining a non-naturally occurring ZFP with a predetermined
binding specificity typically starts with a natural ZFP as a source of framework residues.
The process of design or selection serves to define non-conserved positions (i.e., positions
-1 to +6) so as to confer a desired binding specificity. One ZFP suitable for use as a
framework is the DNA-binding domain of the mouse transcription factor Zif268. Another
suitable natural zinc finger protein as a source of framework residues is Sp-1. The Sp-1
sequence used for construction of zinc finger proteins corresponds to amino acids 531 to
624 in the Sp-1 transcription factor. An additional useful ZFP backbone is that of the Sp-1
consensus sequence, described by Shi et al. (1995) Chemistry and Biology 1:83-89. The
amino acid sequences of these ZFP frameworks are disclosed in co-owned PCT WO
00/42219. In other aspects, the ZFP backbone will comprise a modified plant ZFP
backbone into which one or more of the non-canonical fingers described herein are
inserted so that they bind to a target sequence. Other suitable ZFPs are known to those of
skill in the art and are described herein. The documents cited supra also disclose methods
of assessing binding specificity of modified ZFPs.

Non-canonical zinc fingers therefore include one or more zinc finger
components in which at least one of the C2H2 amino acids has been replaced with
one or more amino acids. In certain embodiments, more than one of the canonical
amino acids is replaced. Examples of non-canonical zinc finger components include:

X3-B-X2^-Cys-X, 2 -His-X,.7-His-X4 
Xj-Cys-Xj^-B-Xn-His-Xur-His-X, 
X3-Cys-X 2 ^-Cys-X 12 -Z-X,. 7 -His-X4 
Xj-Cys-XM-Cys-Xij-His-Xj^-Z-X, 
15 X 3 -B-X w -B-X, 2 -His-X,.7-His-X4 

X3-B-X 2 ^-Cys-X, 2 -Z-X,. 7 -His-X, 
X3-B-X M -Cys-X, 2 -His-Xi.7-Z-X, 
X 3 -Cys-X 2 ^-B-X, 2 -Z-Xi. 7 -His-X4 
X3-Cys-X M -B-X 12 -His-X,.7-Z-X4 
X 3 -Cys-X M -Cys-Xi 2 -Z-X,. 7 -Z-X4 
Xj-Cys-XM-B-Xn-Z-Xj.T-Z-X* 
X 3 -B-X 2 ^-Cys-X 12 -Z-X,.7-Z-X4 
X 3 -B-X 2 ^-B-X, 2 -His-X,. 7 -Z-X, 
X 3 -B-X M -B-X 12 -Z-X,. 7 -His-X4 
25 X 3 -B-X w -B-X, 2 -Z-Xi. 7 -Z-X4 

X 3 -Y-X w -Cys-X 12 -ffis-X,. 7 -His-X, 
X 3 -Cys-X 2 ^-Y-X, 2 -His-X,. 7 -His-X, 
XrC^s-XM-Cys-Xu-Y-X^-His-^ 
X 3 -Cys-X M -Cys-X 12 -His-X,. 7 -Y-X, 
30 X 3 -Y-X M -Y-X 12 -His-X 1 . 7 -His-X 4 

X 3 -Y-X M -Cys-X 12 -Y-X,. 7 -His-X4 
X 3 -Y-X M -Cys-X, 2 -His-X,. 7 -Y-Xi 
X 3 -Cys-X M -Y-X I2 -Y-X,. 7 -His-X4 
Xj-tys-X^-Y-Xu-His-X^-Y-X, 
35 Xj-Cys-Xj^-Cys-Xrz-Y-XLT-Y-X, 

Xj-Cys-XM-Y-Xn-Y-XuT-Y-X, 
X 3 -Y-X 2 _4-Cys-X 12 -Y-X,. 7 -Y-X4 
X 3 -Y-X M -Y-X I2 -His-X,. 7 -Y-X4 
X 3 -Y-X M -Y-X I2 -Y-X,. 7 -His-X4 
X 3 -Y-X 2 ^-Y-X 12 -Y-X,. 7 -Y-X, 
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where X= any amino acid 

B= any amino acid except cysteine 
Z= any amino acid except histidine 
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Y= any amino acid except histidine or cysteine 
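
The substitution patterns listed above appear to follow a simple combinatorial rule: each of the four zinc-coordinating positions is either left canonical or replaced, and at least one position must be replaced, which gives 2^4 - 1 = 15 patterns per symbol set. The sketch below enumerates them under that assumption. The names variants, TEMPLATE and CANONICAL, and the plain-text rendering of the spacers such as (X)2-4, are illustrative choices made here; the enumeration order differs from the order of the list.

```python
from itertools import product

# The four zinc-coordinating residues of a canonical C2H2 finger, N to C.
CANONICAL = ("Cys", "Cys", "His", "His")

# Spacer lengths as given in the list of non-canonical components.
TEMPLATE = "(X)3-{0}-(X)2-4-{1}-(X)12-{2}-(X)1-7-{3}-(X)4"


def variants(substitutes):
    """Enumerate patterns with at least one coordinating residue replaced.

    substitutes holds one replacement symbol per coordinating position,
    e.g. ("B", "B", "Z", "Z") or ("Y", "Y", "Y", "Y").
    """
    patterns = []
    for mask in product((False, True), repeat=4):
        if not any(mask):
            continue  # all four residues canonical: not a modified finger
        residues = [
            sub if replaced else canon
            for replaced, sub, canon in zip(mask, substitutes, CANONICAL)
        ]
        patterns.append(TEMPLATE.format(*residues))
    return patterns


bz_patterns = variants(("B", "B", "Z", "Z"))
y_patterns = variants(("Y", "Y", "Y", "Y"))
print(len(bz_patterns), len(y_patterns))  # 15 15
```

Each set has fifteen members, matching the number of entries in each group of the list.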

A modified ZFP can include any number of zinc finger components, although a
three-finger structure is generally preferred. Typically, the C-terminal-most (e.g., third)
finger of the ZFP is modified and non-canonical. The other fingers of the protein may be
naturally occurring zinc finger components, non-canonical modified components, modified
C2H2 fingers or combinations of these components. Thus, as described below in Example
2, in certain embodiments, a three-finger zinc finger binding protein is provided wherein
the first two fingers are of the C2H2 class and, in the third (C-terminal-most) finger, the
second histidine is substituted with Cys or with Cys and additional amino acids, such as
glycine. In other embodiments, a three-finger zinc finger protein is provided wherein, in
the first (N-terminal-most) finger, the first cysteine residue is substituted with histidine or
with histidine and additional amino acids such as glycine. Furthermore, in certain
embodiments, the second (middle) finger of a three-finger ZFP is modified such that one
or both of the cysteines are replaced with serines (and/or additional amino acids).

Also included herein are nucleic acids encoding a ZFP comprising at least one non- 
canonical zinc finger as described herein. 






B. Linkage 

Two or more zinc finger proteins can be linked to have a target site specificity that
is, to a first approximation, the aggregate of that of the component zinc finger proteins.
For example, a first zinc finger protein having first, second and third component fingers
that respectively bind to XXX, YYY and ZZZ can be linked to a second zinc finger protein
having first, second and third component fingers with binding specificities AAA, BBB
and CCC. The binding specificity of the combined first and second proteins is thus
5'-CCCBBBAAANZZZYYYXXX-3', where N indicates a short intervening region
(typically 0-5 bases of any type). In this situation, the target site can be viewed as
comprising two target segments separated by an intervening segment.
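
The composite site described above can be sketched in a few lines. The function names site_5to3 and linked_site are hypothetical, and the intervening region is represented simply as a run of the placeholder symbol N; the sketch assumes, as in the example, that the first protein occupies the 3' segment of the composite site and the second protein the 5' segment.

```python
def site_5to3(triplets_n_to_c):
    """5'->3' target segment for fingers listed N- to C-terminal."""
    return "".join(reversed(triplets_n_to_c))


def linked_site(first_protein, second_protein, gap=1):
    """Composite 5'->3' site for two linked zinc finger proteins.

    gap is the number of intervening bases, each written as 'N'
    (typically 0-5 according to the text).
    """
    if gap < 0:
        raise ValueError("gap must be non-negative")
    return site_5to3(second_protein) + "N" * gap + site_5to3(first_protein)


print(linked_site(["XXX", "YYY", "ZZZ"], ["AAA", "BBB", "CCC"]))
# CCCBBBAAANZZZYYYXXX
```

Setting gap to 0 gives two directly abutting target segments.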

Linkage of zinc finger proteins can be accomplished using any of the following
peptide linkers:

TGEKP (SEQ ID NO: 5) Liu et al. (1997) Proc. Natl. Acad. Sci. USA 94:5525-5530.

(G4S)n (SEQ ID NO: 6) Kim et al. (1996) Proc. Natl. Acad. Sci. USA 93:1156-1160.

GGRRGGGS (SEQ ID NO: 7)

LRQRDGERP (SEQ ID NO: 8)

LRQKDGGGSERP (SEQ ID NO: 9)

LRQKD(G3S)2ERP (SEQ ID NO: 10).

Alternatively, flexible linkers can be rationally designed using computer programs
capable of modeling both DNA-binding sites and the peptides themselves, or by phage
display methods. In a further variation, non-covalent linkage can be achieved by fusing
two zinc finger proteins with domains promoting heterodimer formation of the two zinc
finger proteins. For example, one zinc finger protein can be fused with fos and the other
with jun (see Barbas et al., WO 95/19431). Alternatively, dimerization interfaces can be
obtained by selection. See, for example, Wang et al. (1999) Proc. Natl. Acad. Sci. USA
96:9568-9573.

Linkage of two or more zinc finger proteins is advantageous for conferring a
unique binding specificity within a mammalian genome. A typical mammalian diploid
genome consists of 3 x 10^9 bp. Assuming that the four nucleotides A, C, G, and T are
randomly distributed, a given 9 bp sequence is present ~23,000 times. Thus a three-finger
ZFP recognizing a 9 bp target with absolute specificity would have the potential to bind to
~23,000 sites within the genome. An 18 bp sequence is present once in 3.4 x 10^10 bp, or
about once in a random DNA sequence whose complexity is ten times that of a
mammalian genome. Thus, linkage of two three-finger ZFPs, to recognize an 18 bp target
sequence, provides the requisite specificity to target a unique site in a typical mammalian
genome.
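
The figures quoted above can be checked with a short calculation. The sketch assumes, in addition to random base composition, that a site may occur on either strand (a factor of two); that assumption is not stated in the text but reproduces both quoted numbers. The function names are illustrative.

```python
GENOME_BP = 3e9  # genome size used in the text


def expected_occurrences(site_len, genome_bp=GENOME_BP, strands=2):
    """Expected number of occurrences of one specific site of length site_len."""
    return strands * genome_bp / 4 ** site_len


def bp_per_occurrence(site_len, strands=2):
    """Length of random sequence expected to contain the site once."""
    return 4 ** site_len / strands


print(f"{expected_occurrences(9):,.0f}")   # 22,888  (about 23,000)
print(f"{bp_per_occurrence(18):.1e}")      # 3.4e+10
print(f"{bp_per_occurrence(18) / GENOME_BP:.1f}")  # 11.5 genome equivalents
```

Under these assumptions a 9 bp site is expected roughly 23,000 times, whereas an 18 bp site is expected less than once per genome.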



C. Fusion Molecules 

The selection and/or design of non-canonical zinc finger-containing proteins
also allows for the design of fusion molecules that facilitate regulation of gene
expression. Thus, in certain embodiments, the compositions and methods disclosed
herein involve fusions between at least one of the zinc finger proteins described
herein (or functional fragments thereof) and one or more functional domains (or
functional fragments thereof), or a polynucleotide encoding such a fusion. The
presence of such a fusion molecule in a cell allows a functional domain to be brought
into proximity with a sequence in a gene that is bound by the zinc finger portion of
the fusion molecule. The transcriptional regulatory function of the functional domain
is then able to act on the gene, by, for example, modulating expression of the gene.

In certain embodiments, fusion proteins comprising a modified zinc finger
DNA-binding domain and a functional domain are used for modulation of
endogenous gene expression as described, for example, in co-owned PCT WO
00/41566. Modulation includes repression and activation of gene expression; the
nature of the modulation generally depending on the type of functional domain
present in the fusion protein. Any polypeptide sequence or domain capable of
influencing gene expression (or functional fragment thereof) that can be fused to a
DNA-binding domain is suitable for use.

An exemplary functional domain for fusing with a ZFP DNA-binding domain,
to be used for repressing gene expression, is a KRAB repression domain from the
human KOX-1 protein (see, e.g., Thiesen et al., New Biologist 2, 363-374 (1990);
Margolin et al., Proc. Natl. Acad. Sci. USA 91, 4509-4513 (1994); Pengue et al.,
Nucl. Acids Res. 22:2908-2914 (1994); Witzgall et al., Proc. Natl. Acad. Sci. USA
91, 4514-4518 (1994)). Another suitable repression domain is methyl binding domain
protein 2B (MBD-2B) (see also Hendrich et al. (1999) Mamm. Genome 10:906-912
for description of MBD proteins). Another useful repression domain is that
associated with the v-ErbA protein. See, for example, Damm et al. (1989) Nature
339:593-597; Evans (1989) Int. J. Cancer Suppl. 4:26-28; Pain et al. (1990) New
Biol. 2:284-294; Sap et al. (1989) Nature 340:242-244; Zenke et al. (1988) Cell
52:107-119; and Zenke et al. (1990) Cell 61:1035-1049. Additional exemplary
repression domains include, but are not limited to, thyroid hormone receptor (TR),
SID, MBD1, MBD2, MBD3, MBD4, MBD-like proteins, members of the DNMT
family (e.g., DNMT1, DNMT3A, DNMT3B), Rb, MeCP1 and MeCP2. See, for
example, Zhang et al. (2000) Ann. Rev. Physiol. 62:439-466; Bird et al. (1999) Cell
99:451-454; Tyler et al. (1999) Cell 99:443-446; Knoepfler et al. (1999) Cell
99:447-450; and Robertson et al. (2000) Nature Genet. 25:338-342. Additional
exemplary repression domains include, but are not limited to, ROM2 and AtHD2A.
See, for example, Chern et al. (1996) Plant Cell 8:305-321; and Wu et al. (2000)
Plant J. 22:19-27.

Suitable domains for achieving activation include the HSV VP16 activation
domain (see, e.g., Hagmann et al., J. Virol. 71, 5952-5962 (1997)); nuclear hormone
receptors (see, e.g., Torchia et al., Curr. Opin. Cell. Biol. 10:373-383 (1998)); the p65
subunit of nuclear factor kappa B (Bitko & Barik, J. Virol. 72:5610-5618 (1998) and
Doyle & Hunt, Neuroreport 8:2937-2942 (1997); Liu et al., Cancer Gene Ther.
5:3-28 (1998)), or artificial chimeric functional domains such as VP64 (Seipel et al.,
EMBO J. 11, 4961-4968 (1992)).

Additional exemplary activation domains include, but are not limited to, p300,
CBP, PCAF, SRC1, PvALF, AtHD2A and ERF-2. See, for example, Robyr et al.
(2000) Mol. Endocrinol. 14:329-347; Collingwood et al. (1999) J. Mol. Endocrinol.
23:255-275; Leo et al. (2000) Gene 245:1-11; Manteuffel-Cymborowska (1999)
Acta Biochim. Pol. 46:77-89; McKenna et al. (1999) J. Steroid Biochem. Mol. Biol.
69:3-12; Malik et al. (2000) Trends Biochem. Sci. 25:277-283; and Lemon et al.
(1999) Curr. Opin. Genet. Dev. 9:499-504. Additional exemplary activation domains
include, but are not limited to, OsGAI, HALF-1, C1, AP1, ARF-5, -6, -7, and -8,
CPRF1, CPRF4, MYC-RP/GP, and TRAB1. See, for example, Ogawa et al. (2000)
Gene 245:21-29; Okanami et al. (1996) Genes Cells 1:87-99; Goff et al. (1991)
Genes Dev. 5:298-309; Cho et al. (1999) Plant Mol. Biol. 40:419-429; Ulmasov et
al. (1999) Proc. Natl. Acad. Sci. USA 96:5844-5849; Sprenger-Haussels et al. (2000)
Plant J. 22:1-8; Gong et al. (1999) Plant Mol. Biol. 41:33-44; and Hobo et al.
(1999) Proc. Natl. Acad. Sci. USA 96:15,348-15,353.

Additional functional domains are disclosed, for example, in co-owned
WO 00/41566. Further, insulator domains, chromatin remodeling proteins such as
ISWI-containing domains and/or methyl binding domain proteins suitable for use in
fusion molecules are described, for example, in co-owned International Publication
WO 01/83793 and PCT/US01/42377.

In additional embodiments, targeted remodeling of chromatin, as disclosed in
co-owned International Publication WO 01/83793, can be used to generate one or
more sites in cellular chromatin that are accessible to the binding of a functional
domain/modified ZFP fusion molecule.

Fusion molecules are constructed by methods of cloning and biochemical
conjugation that are well known to those of skill in the art. Fusion molecules
comprise a modified ZFP binding domain and, for example, a transcriptional
activation domain, a transcriptional repression domain, a component of a chromatin
remodeling complex, an insulator domain or a functional fragment of any of these
domains. In certain embodiments, fusion molecules comprise a non-canonical zinc
finger protein and at least two functional domains (e.g., an insulator domain or a
methyl binding protein domain and, additionally, a transcriptional activation or
repression domain). Fusion molecules also optionally comprise nuclear localization
signals (such as, for example, that from the SV40 medium T-antigen) and epitope tags
(such as, for example, FLAG, see Example 2, and hemagglutinin). Fusion proteins
(and nucleic acids encoding them) are designed such that the translational reading
frame is preserved among the components of the fusion.

The fusion molecules disclosed herein comprise a non-canonical zinc finger
binding protein which binds to a target site. In certain embodiments, the target site is
present in an accessible region of cellular chromatin. Accessible regions can be
determined as described in co-owned International Publications WO 01/83751 and
WO 01/83732. If the target site is not present in an accessible region of cellular
chromatin, one or more accessible regions can be generated as described in co-owned
International Publication WO 01/83793. In additional embodiments, the
non-canonical zinc finger component of a fusion molecule is capable of binding to cellular
chromatin regardless of whether its target site is in an accessible region or not. For
example, a modified ZFP as disclosed herein can be capable of binding to linker DNA
and/or to nucleosomal DNA. Examples of this type of "pioneer" DNA binding
domain are found in certain steroid receptors and in hepatocyte nuclear factor 3
(HNF3). Cordingley et al. (1987) Cell 48:261-270; Pina et al. (1990) Cell
60:719-731; and Cirillo et al. (1998) EMBO J. 17:244-254.

Methods of gene regulation using a functional domain, targeted to a specific
sequence by virtue of a fused DNA binding domain, can achieve modulation of gene
expression. Genes so modulated can be endogenous genes or exogenous genes.
Modulation of gene expression can be in the form of repression (e.g., repressing
expression of exogenous genes, for example, when the target gene resides in a
pathological infecting microorganism, or repression of an endogenous gene of the
subject, such as an oncogene or a viral receptor, that contributes to a disease state).
As described herein, repression of a specific target gene can be achieved by using a
fusion molecule comprising a non-canonical zinc finger protein and a functional
domain.

Alternatively, modulation can be in the form of activation, if activation of a
gene (e.g., a tumor suppressor gene or a transgene) can ameliorate a disease state. In
this case, cellular chromatin is contacted with any of the fusion molecules described
herein, wherein the modified zinc finger portion of the fusion molecule is specific for
the target gene. The functional domain (e.g., insulator domain, activation domain,
etc.) enables increased and/or sustained expression of the target gene.

For any such applications, the fusion molecule(s) can be formulated with a
pharmaceutically acceptable carrier, as is known to those of skill in the art. See, for
example, Remington's Pharmaceutical Sciences, 17th ed., 1985; and co-owned WO
00/42219.



Polynucleotide and Polypeptide Delivery 

The compositions described herein can be provided to the target cell in vitro or
in vivo. In addition, the compositions can be provided as polypeptides,
polynucleotides or combinations thereof.






A. Delivery of Polynucleotides

In certain embodiments, the compositions are provided as one or more
polynucleotides. Further, as noted above, a non-canonical zinc finger
protein-containing composition can be designed as a fusion between a polypeptide zinc finger
and a functional domain that is encoded by a fusion nucleic acid. In both fusion and
non-fusion cases, the nucleic acid can be cloned into intermediate vectors for
transformation into prokaryotic or eukaryotic cells for replication and/or expression.
Intermediate vectors for storage or manipulation of the nucleic acid or production of
protein can be prokaryotic vectors (e.g., plasmids), shuttle vectors, insect vectors, or
viral vectors, for example. A nucleic acid encoding a non-canonical zinc finger
protein can also be cloned into an expression vector, for administration to a bacterial cell,
fungal cell, protozoal cell, plant cell, or animal cell, preferably a mammalian cell,
more preferably a human cell.

To obtain expression of a cloned nucleic acid, it is typically subcloned into an
expression vector that contains a promoter to direct transcription. Suitable bacterial
and eukaryotic promoters are well known in the art and described, e.g., in Sambrook
et al., supra; Ausubel et al., supra; and Kriegler, Gene Transfer and Expression: A
Laboratory Manual (1990). Bacterial expression systems are available in, e.g., E.
coli, Bacillus sp., and Salmonella. Palva et al. (1983) Gene 22:229-235. Kits for
such expression systems are commercially available. Eukaryotic expression systems
for mammalian cells, yeast, and insect cells are well known in the art and are also
commercially available, for example, from Invitrogen, Carlsbad, CA and Clontech,
Palo Alto, CA.

The promoter used to direct expression of the nucleic acid of choice depends
on the particular application. For example, a strong constitutive promoter is typically
used for expression and purification. In contrast, when a protein is to be used in vivo,
either a constitutive or an inducible promoter is used, depending on the particular use
of the protein. In addition, a weak promoter can be used, such as HSV TK or a
promoter having similar activity. The promoter typically can also include elements
that are responsive to transactivation, e.g., hypoxia response elements, Gal4 response
elements, lac repressor response element, and small molecule control systems such as
tet-regulated systems and the RU-486 system. See, e.g., Gossen et al. (1992) Proc.
Natl. Acad. Sci. USA 89:5547-5551; Oligino et al. (1998) Gene Ther. 5:491-496;
Wang et al. (1997) Gene Ther. 4:432-441; Neering et al. (1996) Blood
88:1147-1155; and Rendahl et al. (1998) Nat. Biotechnol. 16:757-761.

In addition to a promoter, an expression vector typically contains a 
transcription unit or expression cassette that contains additional elements required for 
the expression of the nucleic acid in host cells, either prokaryotic or eukaryotic. A 
typical expression cassette thus contains a promoter operably linked, e.g., to the 
nucleic acid sequence, and signals required, e.g., for efficient polyadenylation of the 
transcript, transcriptional termination, ribosome binding, and/or translation 
termination. Additional elements of the cassette may include, e.g., enhancers and 
heterologous spliced intronic signals. 
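
By way of illustration only, the cassette elements named above can be written down as a simple record. The following Python sketch is not part of the disclosed methods: the class name, the field names, the rule that only the promoter and coding sequence are treated as mandatory, and the ordering used in layout() are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExpressionCassette:
    """Hypothetical record of the cassette elements named in the text."""
    promoter: str
    coding_sequence: str
    ribosome_binding_site: str = ""
    polyadenylation_signal: str = ""
    transcription_terminator: str = ""
    # Optional extras such as enhancers or intronic signals.
    optional_elements: List[str] = field(default_factory=list)

    def missing_core_elements(self) -> List[str]:
        """Names of core elements left empty (assumed rule, see lead-in)."""
        core = {"promoter": self.promoter,
                "coding_sequence": self.coding_sequence}
        return [name for name, value in core.items() if not value]

    def layout(self) -> List[str]:
        """Non-empty elements in an assumed order, optional elements last."""
        ordered = [self.promoter, self.ribosome_binding_site,
                   self.coding_sequence, self.polyadenylation_signal,
                   self.transcription_terminator]
        return [e for e in ordered if e] + list(self.optional_elements)


# Example values are placeholders, not sequences from the disclosure.
cassette = ExpressionCassette(
    promoter="SV40 early promoter",
    coding_sequence="ZFP fusion ORF",
    polyadenylation_signal="poly(A) signal",
    optional_elements=["enhancer"],
)
```

A record of this kind only makes the list of elements explicit; which elements are actually required depends on the host cell and application, as stated above.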

The particular expression vector used to transport the genetic information into 
the cell is selected with regard to the intended use of the resulting ZFP polypeptide, 
e.g., expression in plants, animals, bacteria, fungi, protozoa etc. Standard bacterial 
expression vectors include plasmids such as pBR322, pBR322-based plasmids, pSKF, 
pET23D, and commercially available fusion expression systems such as GST and 
LacZ. Epitope tags can also be added to recombinant proteins to provide convenient 
methods of isolation, for monitoring expression, and for monitoring cellular and 
subcellular localization, e.g., c-myc or FLAG. 

Expression vectors containing regulatory elements from eukaryotic viruses are 
often used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus 
vectors, and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic 
vectors include pMSG, pAV009/A+, pMTO10/A+, pMAMneo-5, baculovirus 
pDSVE, and any other vector allowing expression of proteins under the direction of 
the SV40 early promoter, SV40 late promoter, metallothionein promoter, murine 
mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, 
or other promoters shown effective for expression in eukaryotic cells. 

Some expression systems have markers for selection of stably transfected cell 
lines such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate 
reductase. High-yield expression systems are also suitable, such as baculovirus 
vectors in insect cells, with a nucleic acid sequence coding for a ZFP as described 
herein under the transcriptional control of the polyhedrin promoter or any other strong 
baculovirus promoter. 

Elements that are typically included in expression vectors also include a 
replicon that functions in E. coli (or in the prokaryotic host, if other than E. coli), a 
selective marker, e.g., a gene encoding antibiotic resistance, to permit selection of 
bacteria that harbor recombinant plasmids, and unique restriction sites in nonessential 
regions of the vector to allow insertion of recombinant sequences. 

Standard transfection methods can be used to produce bacterial, mammalian, 
yeast, insect, or other cell lines that express large quantities of non-canonical zinc 
finger proteins, which can be purified, if desired, using standard techniques. See, e.g., 
Colley et al. (1989) J. Biol. Chem. 264:17619-17622; and Guide to Protein 
Purification, in Methods in Enzymology, vol. 182 (Deutscher, ed.) 1990. 
Transformation of eukaryotic and prokaryotic cells is performed according to standard 
techniques. See, e.g., Morrison (1977) J. Bacteriol. 132:349-351; Clark-Curtiss et al. 
(1983) in Methods in Enzymology 101:347-362 (Wu et al., eds). 

Any procedure for introducing foreign nucleotide sequences into host cells can 
be used. These include, but are not limited to, the use of calcium phosphate 
transfection, DEAE-dextran-mediated transfection, polybrene, protoplast fusion, 
electroporation, lipid-mediated delivery (e.g., liposomes), microinjection, particle 
bombardment, introduction of naked DNA, plasmid vectors, viral vectors (both 
episomal and integrative) and any of the other well known methods for introducing 
cloned genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a 
host cell (see, e.g., Sambrook et al., supra). It is only necessary that the particular 
genetic engineering procedure used be capable of successfully introducing at least one 
gene into the host cell capable of expressing the protein of choice. 

Conventional viral and non-viral based gene transfer methods can be used to 
introduce nucleic acids into mammalian cells or target tissues. Such methods can be 
used to administer nucleic acids encoding reprogramming polypeptides to cells in 
vitro. Preferably, nucleic acids are administered for in vivo or ex vivo gene therapy 
uses. Non-viral vector delivery systems include DNA plasmids, naked nucleic acid, 
and nucleic acid complexed with a delivery vehicle such as a liposome. Viral vector 
delivery systems include DNA and RNA viruses, which have either episomal or 
integrated genomes after delivery to the cell. For reviews of gene therapy procedures, 
see, for example, Anderson (1992) Science 256:808-813; Nabel et al. (1993) Trends 
Biotechnol. 11:211-217; Mitani et al. (1993) Trends Biotechnol. 11:162-166; Dillon 
(1993) Trends Biotechnol. 11:167-175; Miller (1992) Nature 357:455-460; Van 
Brunt (1988) Biotechnology 6(10):1149-1154; Vigne (1995) Restorative Neurology 
and Neuroscience 8:35-36; Kremer et al. (1995) British Medical Bulletin 51(1):31-44; 
Haddada et al., in Current Topics in Microbiology and Immunology, Doerfler and 
Böhm (eds), 1995; and Yu et al. (1994) Gene Therapy 1:13-26. 

Methods of non-viral delivery of nucleic acids include lipofection, 
microinjection, ballistics, virosomes, liposomes, immunoliposomes, polycation or 
lipid:nucleic acid conjugates, naked DNA, artificial virions, and agent-enhanced 
uptake of DNA. Lipofection is described in, e.g., U.S. Patent Nos. 5,049,386; 
4,946,787; and 4,897,355, and lipofection reagents are sold commercially (e.g., 
Transfectam™ and Lipofectin™). Cationic and neutral lipids that are suitable for 
efficient receptor-recognition lipofection of polynucleotides include those of Felgner, 
WO 91/17424 and WO 91/16024. Nucleic acid can be delivered to cells (ex vivo 
administration) or to target tissues (in vivo administration). 

The preparation of lipid:nucleic acid complexes, including targeted liposomes 
such as immunolipid complexes, is well known to those of skill in the art. See, e.g., 
Crystal (1995) Science 270:404-410; Blaese et al. (1995) Cancer Gene Ther. 2:291-297; 
Behr et al. (1994) Bioconjugate Chem. 5:382-389; Remy et al. (1994) 
Bioconjugate Chem. 5:647-654; Gao et al. (1995) Gene Therapy 2:710-722; Ahmad 
et al. (1992) Cancer Res. 52:4817-4820; and U.S. Patent Nos. 4,186,183; 4,217,344; 
4,235,871; 4,261,975; 4,485,054; 4,501,728; 4,774,085; 4,837,028 and 4,946,787. 

The use of RNA or DNA virus-based systems for the delivery of nucleic acids 
takes advantage of highly evolved processes for targeting a virus to specific cells in the 
body and trafficking the viral payload to the nucleus. Viral vectors can be 
administered directly to patients (in vivo) or they can be used to treat cells in vitro, 
wherein the modified cells are administered to patients (ex vivo). Conventional viral 
based systems for the delivery of ZFPs include retroviral, lentiviral, poxviral, 
adenoviral, adeno-associated viral, vesicular stomatitis viral and herpesviral vectors. 
Integration in the host genome is possible with certain viral vectors, including the 
retrovirus, lentivirus, and adeno-associated virus gene transfer methods, often 
resulting in long term expression of the inserted transgene. Additionally, high 
transduction efficiencies have been observed in many different cell types and target 
tissues. 

The tropism of a retrovirus can be altered by incorporating foreign envelope 
proteins, allowing alteration and/or expansion of the potential target cell population. 
Lentiviral vectors are retroviral vectors that are able to transduce or infect non-dividing 
cells and typically produce high viral titers. Selection of a retroviral gene transfer 
system would therefore depend on the target tissue. Retroviral vectors have a 
packaging capacity of up to 6-10 kb of foreign sequence and are comprised of 
cis-acting long terminal repeats (LTRs). The minimum cis-acting LTRs are sufficient for 
replication and packaging of the vectors, which are then used to integrate the 
therapeutic gene into the target cell to provide permanent transgene expression. 
Widely used retroviral vectors include those based upon murine leukemia virus 
(MuLV), gibbon ape leukemia virus (GaLV), simian immunodeficiency virus (SIV), 
human immunodeficiency virus (HIV), and combinations thereof. See, e.g., Buchscher et al. 
(1992) J. Virol. 66:2731-2739; Johann et al. (1992) J. Virol. 66:1635-1640; 
Sommerfelt et al. (1990) Virol. 176:58-59; Wilson et al. (1989) J. Virol. 63:2374-2378; 
Miller et al. (1991) J. Virol. 65:2220-2224; and PCT/US94/05700. 
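
As a purely illustrative aid, the packaging capacity stated above (up to 6-10 kb of foreign sequence) can be turned into a quick size check for a candidate insert. The Python sketch below is an assumption-laden example and not a method of the disclosure: the function names, the three result labels, and the choice to treat 6 kb as a conservative limit and 10 kb as an upper bound are introduced here for illustration.

```python
# Range of foreign sequence a retroviral vector can carry, per the text
# above (6-10 kb), expressed in base pairs.
RETROVIRAL_CAPACITY_RANGE_BP = (6_000, 10_000)


def classify_insert(insert_bp: int) -> str:
    """Classify an insert length against the stated 6-10 kb range.

    The labels are hypothetical: "within" means at or below the
    conservative 6 kb figure, "borderline" means between 6 and 10 kb,
    and "exceeds" means above the 10 kb upper figure.
    """
    if insert_bp <= 0:
        raise ValueError("insert length must be a positive number of bp")
    low, high = RETROVIRAL_CAPACITY_RANGE_BP
    if insert_bp <= low:
        return "within"
    if insert_bp <= high:
        return "borderline"
    return "exceeds"


def fits_retroviral_capacity(insert_bp: int) -> bool:
    """True when the insert does not exceed the 10 kb upper figure."""
    return classify_insert(insert_bp) != "exceeds"
```

An insert classified as "borderline" would call for checking the capacity of the specific vector backbone, since the text gives a range rather than a single limit.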

Adeno-associated virus (AAV) vectors are also used to transduce cells with 
target nucleic acids, e.g., in the in vitro production of nucleic acids and peptides, and 
for in vivo and ex vivo gene therapy procedures. See, e.g., West et al. (1987) Virology 
160:38-47; U.S. Patent No. 4,797,368; WO 93/24641; Kotin (1994) Hum. Gene 
Ther. 5:793-801; and Muzyczka (1994) J. Clin. Invest. 94:1351. Construction of 
recombinant AAV vectors is described in a number of publications, including U.S. 
Patent No. 5,173,414; Tratschin et al. (1985) Mol. Cell. Biol. 5:3251-3260; 
Tratschin et al. (1984) Mol. Cell. Biol. 4:2072-2081; Hermonat et al. (1984) Proc. 
Natl. Acad. Sci. USA 81:6466-6470; and Samulski et al. (1989) J. Virol. 63:3822-3828. 

Recombinant adeno-associated virus vectors based on the defective and 
nonpathogenic parvovirus adeno-associated virus type 2 (AAV-2) are a promising 
gene delivery system. Exemplary AAV vectors are derived from a plasmid 
containing the AAV 145 bp inverted terminal repeats flanking a transgene expression 
cassette. Efficient gene transfer and stable transgene delivery due to integration into 
the genome of the transduced cell are key features for this vector system. Wagner et 
al. (1998) Lancet 351(9117):1702-3; and Kearns et al. (1996) Gene Ther. 9:748-55. 

pLASN and MFG-S are examples of retroviral vectors that have been used in 
clinical trials. Dunbar et al. (1995) Blood 85:3048-305; Kohn et al. (1995) Nature Med. 
1:1017-102; Malech et al. (1997) Proc. Natl. Acad. Sci. USA 94:12133-12138. 
PA317/pLASN was the first therapeutic vector used in a gene therapy trial (Blaese et al. 
(1995) Science 270:475-480). Transduction efficiencies of 50% or greater have been 
observed for MFG-S packaged vectors. Ellem et al. (1997) Immunol. Immunother. 
44(1):10-20; Dranoff et al. (1997) Hum. Gene Ther. 1:111-2. 

In applications for which transient expression is preferred, adenoviral-based 
systems are useful. Adenoviral based vectors are capable of very high transduction 
efficiency in many cell types and are capable of infecting, and hence delivering 
nucleic acid to, both dividing and non-dividing cells. With such vectors, high titers 
and levels of expression have been obtained. Adenovirus vectors can be produced in 
large quantities in a relatively simple system. 

Replication-deficient recombinant adenovirus (Ad) vectors can be produced at 
high titer and they readily infect a number of different cell types. Most adenovirus 
vectors are engineered such that a transgene replaces the Ad E1a, E1b, and/or E3 
genes; the replication defective vector is propagated in human 293 cells that supply the 
required E1 functions in trans. Ad vectors can transduce multiple types of tissues in 
vivo, including non-dividing, differentiated cells such as those found in the liver, 
kidney and muscle. Conventional Ad vectors have a large carrying capacity for 
inserted DNA. An example of the use of an Ad vector in a clinical trial involved 
polynucleotide therapy for antitumor immunization with intramuscular injection. 
Sterman et al. (1998) Hum. Gene Ther. 7:1083-1089. Additional examples of the use 
of adenovirus vectors for gene transfer in clinical trials include Rosenecker et al. 
(1996) Infection 24:5-10; Sterman et al., supra; Welsh et al. (1995) Hum. Gene 
Ther. 2:205-218; Alvarez et al. (1997) Hum. Gene Ther. 5:597-613; and Topf et al. 
(1998) Gene Ther. 5:507-513. 

Packaging cells are used to form virus particles that are capable of infecting a 
host cell. Such cells include 293 cells, which package adenovirus, and ψ2 cells or 
PA317 cells, which package retroviruses. Viral vectors used in gene therapy are 
usually generated by a producer cell line that packages a nucleic acid vector into a 
viral particle. The vectors typically contain the minimal viral sequences required for 
packaging and subsequent integration into a host, other viral sequences being replaced 
by an expression cassette for the protein to be expressed. Missing viral functions are 
supplied in trans, if necessary, by the packaging cell line. For example, AAV vectors 
used in gene therapy typically only possess ITR sequences from the AAV genome, 
which are required for packaging and integration into the host genome. Viral DNA is 
packaged in a cell line, which contains a helper plasmid encoding the other AAV 
genes, namely rep and cap, but lacking ITR sequences. The cell line is also infected 
with adenovirus as a helper. The helper virus promotes replication of the AAV vector 
and expression of AAV genes from the helper plasmid. The helper plasmid is not 
packaged in significant amounts due to a lack of ITR sequences. Contamination with 
adenovirus can be reduced by, e.g., heat treatment, which preferentially inactivates 
adenoviruses. 

In many gene therapy applications, it is desirable that the gene therapy vector 
be delivered with a high degree of specificity to a particular tissue type. A viral 
vector can be modified to have specificity for a given cell type by expressing a ligand 
as a fusion protein with a viral coat protein on the outer surface of the virus. The 
ligand is chosen to have affinity for a receptor known to be present on the cell type of 
interest. For example, Han et al. (1995) Proc. Natl. Acad. Sci. USA 92:9747-9751 
reported that Moloney murine leukemia virus can be modified to express human 
heregulin fused to gp70, and the recombinant virus infects certain human breast 
cancer cells expressing human epidermal growth factor receptor. This principle can 
be extended to other pairs of virus expressing a ligand fusion protein and target cell 
expressing a receptor. For example, filamentous phage can be engineered to display 
antibody fragments (e.g., Fab or Fv) having specific binding affinity for virtually any 
chosen cellular receptor. Although the above description applies primarily to viral 
vectors, the same principles can be applied to non-viral vectors. Such vectors can be 
engineered to contain specific uptake sequences thought to favor uptake by specific 
target cells. 

Gene therapy vectors can be delivered in vivo by administration to an 
individual patient, typically by systemic administration (e.g., intravenous, 
intraperitoneal, intramuscular, subdermal, or intracranial infusion) or topical 
application, as described infra. Alternatively, vectors can be delivered to cells ex 
vivo, such as cells explanted from an individual patient (e.g., lymphocytes, bone 
marrow aspirates, tissue biopsy) or universal donor hematopoietic stem cells, 
followed by reimplantation of the cells into a patient, usually after selection for cells 
which have incorporated the vector. 

Ex vivo cell transfection for diagnostics, research, or for gene therapy (e.g., via 
re-infusion of the transfected cells into the host organism) is well known to those of 
skill in the art. In a preferred embodiment, cells are isolated from the subject 
organism, transfected with a nucleic acid (gene or cDNA), and re-infused back into 
the subject organism (e.g., patient). Various cell types suitable for ex vivo 
transfection are well known to those of skill in the art. See, e.g., Freshney et al., 
Culture of Animal Cells, A Manual of Basic Technique, 3rd ed., 1994, and references 
cited therein, for a discussion of isolation and culture of cells from patients. 

In one embodiment, hematopoietic stem cells are used in ex vivo procedures 
for cell transfection and gene therapy. The advantage to using stem cells is that they 
can be differentiated into other cell types in vitro, or can be introduced into a mammal 
(such as the donor of the cells) where they will engraft in the bone marrow. Methods 
for differentiating CD34+ stem cells in vitro into clinically important immune cell 
types using cytokines such as GM-CSF, IFN-γ and TNF-α are known. Inaba et al. 
(1992) J. Exp. Med. 176:1693-1702. 

Stem cells are isolated for transduction and differentiation using known 
methods. For example, stem cells are isolated from bone marrow cells by panning the 
bone marrow cells with antibodies which bind unwanted cells, such as CD4+ and 
CD8+ (T cells), CD45+ (pan B cells), GR-1 (granulocytes), and Iad (differentiated 
antigen presenting cells). See Inaba et al., supra. 

Vectors (e.g., retroviruses, adenoviruses, liposomes, etc.) containing 
therapeutic nucleic acids can also be administered directly to the organism for 
transduction of cells in vivo. Alternatively, naked DNA can be administered. 
Administration is by any of the routes normally used for introducing a molecule into 
ultimate contact with blood or tissue cells. Suitable methods of administering such 
nucleic acids are available and well known to those of skill in the art, and, although 
more than one route can be used to administer a particular composition, a particular 
route can often provide a more immediate and more effective reaction than another 
route. 

Pharmaceutically acceptable carriers are determined in part by the particular 
composition being administered, as well as by the particular method used to 
administer the composition. Accordingly, there are a wide variety of suitable 
formulations of pharmaceutical compositions described herein. See, e.g., Remington's 
Pharmaceutical Sciences, 17th ed., 1989. 

B. Delivery of Polypeptides 

In additional embodiments, fusion proteins are administered directly to target 
cells. In certain in vitro situations, the target cells are cultured in a medium 
containing a fusion protein comprising one or more functional domains fused to one 
or more of the modified ZFPs described herein. 

An important factor in the administration of polypeptide compounds is 
ensuring that the polypeptide has the ability to traverse the plasma membrane of a 
cell, or the membrane of an intra-cellular compartment such as the nucleus. Cellular 
membranes are composed of lipid-protein bilayers that are freely permeable to small, 
nonionic lipophilic compounds and are inherently impermeable to polar compounds, 
macromolecules, and therapeutic or diagnostic agents. However, proteins, lipids and 
other compounds, which have the ability to translocate polypeptides across a cell 
membrane, have been described. 

For example, "membrane translocation polypeptides" have amphiphilic or 
hydrophobic amino acid subsequences that have the ability to act as membrane-translocating 
carriers. In one embodiment, homeodomain proteins have the ability to 
translocate across cell membranes. The shortest internalizable peptide of a 
homeodomain protein, Antennapedia, was found to be the third helix of the protein, 
from amino acid position 43 to 58. Prochiantz (1996) Curr. Opin. Neurobiol. 6:629-634. 
Another subsequence, the h (hydrophobic) domain of signal peptides, was found 
to have similar cell membrane translocation characteristics. Lin et al. (1995) J. Biol. 
Chem. 270:14255-14258. 

Examples of peptide sequences which can be linked to a non-canonical zinc 
finger polypeptide (or fusion containing the same) for facilitating its uptake into cells 
include, but are not limited to: an 11 amino acid peptide of the tat protein of HIV; a 
20 residue peptide sequence which corresponds to amino acids 84-103 of the p16 
protein (see Fahraeus et al. (1996) Curr. Biol. 6:84); the third helix of the 60-amino 
acid long homeodomain of Antennapedia (Derossi et al. (1994) J. Biol. Chem. 
269:10444); the h region of a signal peptide, such as the Kaposi fibroblast growth 
factor (K-FGF) h region (Lin et al., supra); and the VP22 translocation domain from 
HSV (Elliot et al. (1997) Cell 88:223-233). Other suitable chemical moieties that 
provide enhanced cellular uptake can also be linked, either covalently or 
non-covalently, to the ZFPs. 

Toxin molecules also have the ability to transport polypeptides across cell 
membranes. Often, such molecules (called "binary toxins") are composed of at least 
two parts: a translocation or binding domain and a separate toxin domain. Typically, 
the translocation domain, which can optionally be a polypeptide, binds to a cellular 
receptor, facilitating transport of the toxin into the cell. Several bacterial toxins, 
including Clostridium perfringens iota toxin, diphtheria toxin (DT), Pseudomonas 
exotoxin A (PE), pertussis toxin (PT), Bacillus anthracis toxin, and pertussis 
adenylate cyclase (CYA), have been used to deliver peptides to the cell cytosol as 
internal or amino-terminal fusions. Arora et al. (1993) J. Biol. Chem. 268:3334-3341; 
Perelle et al. (1993) Infect. Immun. 61:5147-5156; Stenmark et al. (1991) J. Cell 
Biol. 113:1025-1032; Donnelly et al. (1993) Proc. Natl. Acad. Sci. USA 90:3530-3534; 
Carbonetti et al. (1995) Abstr. Annu. Meet. Am. Soc. Microbiol. 95:295; Sebo 
et al. (1995) Infect. Immun. 63:3851-3857; Klimpel et al. (1992) Proc. Natl. Acad. 
Sci. USA 89:10277-10281; and Novak et al. (1992) J. Biol. Chem. 267:17186-17193. 

Such subsequences can be used to translocate polypeptides, including the 
polypeptides as disclosed herein, across a cell membrane. This is accomplished, for 
example, by derivatizing the fusion polypeptide with one of these translocation 
sequences, or by forming an additional fusion of the translocation sequence with the 
fusion polypeptide. Optionally, a linker can be used to link the fusion polypeptide and 
the translocation sequence. Any suitable linker can be used, e.g., a peptide linker. 

A suitable polypeptide can also be introduced into an animal cell, preferably a 
mammalian cell, via liposomes and liposome derivatives such as immunoliposomes. 
The term "liposome" refers to vesicles comprised of one or more concentrically 
ordered lipid bilayers, which encapsulate an aqueous phase. The aqueous phase 
typically contains the compound to be delivered to the cell. 

The liposome fuses with the plasma membrane, thereby releasing the 
compound into the cytosol. Alternatively, the liposome is phagocytosed or taken up 
by the cell in a transport vesicle. Once in the endosome or phagosome, the liposome 
is either degraded or it fuses with the membrane of the transport vesicle and releases 
its contents. 

In current methods of drug delivery via liposomes, the liposome ultimately 
becomes permeable and releases the encapsulated compound at the target tissue or 
cell. For systemic or tissue specific delivery, this can be accomplished, for example, 
in a passive manner wherein the liposome bilayer is degraded over time through the 
action of various agents in the body. Alternatively, active drug release involves using 
an agent to induce a permeability change in the liposome vesicle. Liposome 
membranes can be constructed so that they become destabilized when the 
environment becomes acidic near the liposome membrane. See, e.g., Proc. Natl. 
Acad. Sci. USA 84:7851 (1987); Biochemistry 28:908 (1989). When liposomes are 
endocytosed by a target cell, for example, they become destabilized and release their 
contents. This destabilization is termed fusogenesis. 
Dioleoylphosphatidylethanolamine (DOPE) is the basis of many "fusogenic" systems. 

For use with the methods and compositions disclosed herein, liposomes 
typically comprise a fusion polypeptide as disclosed herein, a lipid component, e.g., a 
neutral and/or cationic lipid, and optionally include a receptor-recognition molecule 
such as an antibody that binds to a predetermined cell surface receptor or ligand (e.g., 
an antigen). A variety of methods are available for preparing liposomes as described 
in, e.g., U.S. Patent Nos. 4,186,183; 4,217,344; 4,235,871; 4,261,975; 4,485,054; 
4,501,728; 4,774,085; 4,837,028; 4,946,787; PCT Publication No. WO 91/17424; 
Szoka et al. (1980) Ann. Rev. Biophys. Bioeng. 9:467; Deamer et al. (1976) Biochim. Biophys. 
Acta 443:629-634; Fraley et al. (1979) Proc. Natl. Acad. Sci. USA 76:3348-3352; 
Hope et al. (1985) Biochim. Biophys. Acta 812:55-65; Mayer et al. (1986) Biochim. 
Biophys. Acta 858:161-168; Williams et al. (1988) Proc. Natl. Acad. Sci. USA 
85:242-246; Liposomes (Ostro (ed.), 1983, Chapter 1); Hope et al. (1986) Chem. 
Phys. Lip. 40:89; Gregoriadis, Liposome Technology (1984) and Lasic, Liposomes: 
from Physics to Applications (1993). Suitable methods include, for example, 
sonication, extrusion, high pressure/homogenization, microfluidization, detergent 
dialysis, calcium-induced fusion of small liposome vesicles and ether-fusion methods, 
all of which are well known in the art. 

In certain embodiments, it may be desirable to target a liposome using 
targeting moieties that are specific to a particular cell type, tissue, and the like. 
Targeting of liposomes using a variety of targeting moieties (e.g., ligands, receptors, 
and monoclonal antibodies) has been previously described. See, e.g., U.S. Patent 
Nos. 4,957,773 and 4,603,044. 

Examples of targeting moieties include monoclonal antibodies specific to antigens 
associated with neoplasms, such as prostate cancer specific antigen and MAGE. Tumors 
can also be diagnosed by detecting gene products resulting from the activation or 
over-expression of oncogenes, such as ras or c-erbB2. In addition, many tumors express 
antigens normally expressed by fetal tissue, such as the alphafetoprotein (AFP) and 
carcinoembryonic antigen (CEA). Sites of viral infection can be diagnosed using various 
viral antigens such as hepatitis B core and surface antigens (HBVc, HBVs), hepatitis C 
antigens, Epstein-Barr virus antigens, human immunodeficiency type-1 virus (HIV-1) and 
papilloma virus antigens. Inflammation can be detected using molecules specifically 
recognized by surface molecules which are expressed at sites of inflammation such as 
integrins (e.g., VCAM-1), selectin receptors (e.g., ELAM-1) and the like. 

Standard methods for coupling targeting agents to liposomes are used. These 
methods generally involve the incorporation into liposomes of lipid components, e.g., 
phosphatidylethanolamine, which can be activated for attachment of targeting agents, 
or incorporation of derivatized lipophilic compounds, such as lipid derivatized 
bleomycin. Antibody targeted liposomes can be constructed using, for instance, 
liposomes which incorporate protein A. See Renneisen et al. (1990) J. Biol. Chem. 
265:16337-16342 and Leonetti et al. (1990) Proc. Natl. Acad. Sci. USA 87:2448-2451. 

Pharmaceutical compositions and administration 

The modified zinc finger proteins and fusion molecules as disclosed herein, 
and expression vectors encoding these polypeptides, can be used in conjunction with 
various methods of gene therapy to facilitate the action of a therapeutic gene product. 
In such applications, the ZFP-containing compositions can be administered directly to 
a patient, e.g., to facilitate the modulation of gene expression and for therapeutic 
or prophylactic applications, for example, cancer (including tumors associated with 
Wilms' third tumor gene), ischemia, diabetic retinopathy, macular degeneration, 
rheumatoid arthritis, psoriasis, HIV infection, sickle cell anemia, Alzheimer's disease, 
muscular dystrophy, neurodegenerative diseases, vascular disease, cystic fibrosis, 
stroke, and the like. Examples of microorganisms whose inhibition can be facilitated 
through use of the methods and compositions disclosed herein include pathogenic 
bacteria, e.g., Chlamydia, Rickettsial bacteria, Mycobacteria, Staphylococci, 
Streptococci, Pneumococci, Meningococci and Gonococci, Klebsiella, Proteus, 
Serratia, Pseudomonas, Legionella, Diphtheria, Salmonella, Bacilli (e.g., anthrax), 
Vibrio (e.g., cholera), Clostridium (e.g., tetanus, botulism), Yersinia (e.g., plague), 
Leptospirosis, and Borrelia (e.g., Lyme disease bacteria); infectious fungus, e.g., 
Aspergillus, Candida species; protozoa such as sporozoa (e.g., Plasmodia), rhizopods 
(e.g., Entamoeba) and flagellates (Trypanosoma, Leishmania, Trichomonas, Giardia, 
etc.); viruses, e.g., hepatitis (A, B, or C), herpes viruses (e.g., VZV, HSV-1, HHV-6, 
HSV-II, CMV, and EBV), HIV, Ebola, Marburg and related hemorrhagic fever-causing 
viruses, adenoviruses, influenza viruses, flaviviruses, echoviruses, 
rhinoviruses, coxsackie viruses, coronaviruses, respiratory syncytial viruses, mumps 
viruses, rotaviruses, measles viruses, rubella viruses, parvoviruses, vaccinia viruses, 
HTLV viruses, retroviruses, lentiviruses, dengue viruses, papillomaviruses, 
polioviruses, rabies viruses, and arboviral encephalitis viruses, etc. 

Administration of therapeutically effective amounts of modified ZFPs 
described herein, fusion molecules including these ZFPs, or nucleic acids encoding 
these polypeptides, is by any of the routes normally used for introducing polypeptides 
or nucleic acids into ultimate contact with the tissue to be treated. The polypeptides 
or nucleic acids are administered in any suitable manner, preferably with 
pharmaceutically acceptable carriers. Suitable methods of administering such 
modulators are available and well known to those of skill in the art, and, although 
more than one route can be used to administer a particular composition, a particular 
route can often provide a more immediate and more effective reaction than another 
route. 

Pharmaceutically acceptable carriers are determined in part by the particular 
composition being administered, as well as by the particular method used to 
administer the composition. Accordingly, there are a wide variety of suitable 
formulations of pharmaceutical compositions. See, e.g., Remington's Pharmaceutical 
Sciences, 17th ed. 1985. 

ZFPs and ZFP fusion polypeptides or nucleic acids, alone or in combination 
with other suitable components, can be made into aerosol formulations (i.e., they can 
be "nebulized") to be administered via inhalation. Aerosol formulations can be 
placed into pressurized acceptable propellants, such as dichlorodifluoromethane, 
propane, nitrogen, and the like. 

Formulations suitable for parenteral administration, such as, for example, by 
intravenous, intramuscular, intradermal, and subcutaneous routes, include aqueous 
and non-aqueous, isotonic sterile injection solutions, which can contain antioxidants, 
buffers, bacteriostats, and solutes that render the formulation isotonic with the blood 
of the intended recipient, and aqueous and non-aqueous sterile suspensions that can 
include suspending agents, solubilizers, thickening agents, stabilizers, and 
preservatives. Compositions can be administered, for example, by intravenous 
infusion, orally, topically, intraperitoneally, intravesically or intrathecally. The 
formulations of compounds can be presented in unit-dose or multi-dose sealed 
containers, such as ampoules and vials. Injection solutions and suspensions can be 
prepared from sterile powders, granules, and tablets of the kind known to those of 
skill in the art. 

Applications 

The compositions and methods disclosed herein can be used to facilitate a 
number of processes involving transcriptional regulation. These processes include, 
but are not limited to, transcription, replication, recombination, repair, integration, 
maintenance of telomeres, processes involved in chromosome stability and 
disjunction, and maintenance and propagation of chromatin structures. Accordingly, 
the methods and compositions disclosed herein can be used to affect any of these 
processes, as well as any other process that can be influenced by ZFPs or ZFP fusions. 
In preferred embodiments, one or more of the molecules described herein are 
used to achieve targeted activation or repression of gene expression, e.g., based upon 
the specificity of the modified ZFP. In another embodiment, one or more of the 
molecules described herein are used to achieve reactivation of a gene, for example a 
developmentally silenced gene; or to achieve sustained activation of a transgene. The 
modified ZFP can be targeted to a region outside of the coding region of the gene of 
interest and, in certain embodiments, is targeted to a region outside the regulatory 
region(s) of the gene. In these embodiments, additional molecules, exogenous and/or 
endogenous, can be used to facilitate repression or activation of gene expression. The 
additional molecules can also be fusion molecules, for example, fusions between a 
ZFP and a functional domain such as an activation or repression domain. See, for 
example, co-owned WO 00/41566. 

Accordingly, expression of any gene in any organism can be modulated using 
the methods and compositions disclosed herein, including therapeutically relevant 
genes, genes of infecting microorganisms, viral genes, and genes whose expression is 
modulated in the processes of drug discovery and/or target validation. Such genes 
include, but are not limited to, Wilms' third tumor gene (WT3), vascular endothelial 
growth factors (VEGFs), VEGF receptors (e.g., flt and flk), CCR-5, low density 
lipoprotein receptor (LDLR), estrogen receptor, HER-2/neu, BRCA-1, BRCA-2, 
phosphoenolpyruvate carboxykinase (PEPCK), CYP7, fibrinogen, apolipoprotein A 
(ApoA), apolipoprotein B (ApoB), renin, phosphoenolpyruvate carboxykinase 
(PEPCK), CYP7, fibrinogen, nuclear factor κB (NF-κB), inhibitor of NF-κB (I-κB), 
tumor necrosis factors (e.g., TNF-α, TNF-β), interleukin-1 (IL-1), FAS (CD95), FAS 
ligand (CD95L), atrial natriuretic factor, platelet-derived factor (PDF), amyloid 
precursor protein (APP), tyrosinase, tyrosine hydroxylase, β-aspartyl hydroxylase, 
alkaline phosphatase, calpains (e.g., CAPN10), neuronal pentraxin receptor, 
adriamycin response protein, apolipoprotein E (apoE), leptin, leptin receptor, UCP-1, 
IL-1, IL-1 receptor, IL-2, IL-3, IL-4, IL-5, IL-6, IL-12, IL-15, interleukin receptors, 
G-CSF, GM-CSF, colony stimulating factor, erythropoietin (EPO), platelet-derived 
growth factor (PDGF), PDGF receptor, fibroblast growth factor (FGF), FGF receptor, 
PAF, p16, p19, p53, Rb, p21, myc, myb, globin, dystrophin, eutrophin, cystic fibrosis 
transmembrane conductance regulator (CFTR), GNDF, nerve growth factor (NGF), 
NGF receptor, epidermal growth factor (EGF), EGF receptor, transforming growth 
factors (e.g., TGF-α, TGF-β), fibroblast growth factor (FGF), interferons (e.g., 
IFN-α, IFN-β and IFN-γ), insulin-related growth factor-1 (IGF-1), angiostatin, 
ICAM-1, signal transducer and activator of transcription (STAT), androgen receptors, 
e-cadherin, cathepsins (e.g., cathepsin W), topoisomerase, telomerase, bcl, bcl-2, Bax, 
T Cell-specific tyrosine kinase (Lck), p38 mitogen-activated protein kinase, protein 
tyrosine phosphatase (bPTP), adenylate cyclase, guanylate cyclase, α7 neuronal 
nicotinic acetylcholine receptor, 5-hydroxytryptamine (serotonin)-2A receptor, 
transcription elongation factor-3 (TEF-3), phosphatidylcholine transferase, ftz, PTI-1, 
polygalacturonase, EPSP synthase, FAD2-1, Δ-9 desaturase, Δ-12 desaturase, Δ-15 
desaturase, acetyl-Coenzyme A carboxylase, acyl-ACP thioesterase, ADP-glucose 
pyrophosphorylase, starch synthase, cellulose synthase, sucrose synthase, fatty acid 
hydroperoxide lyase, and peroxisome proliferator-activated receptors, such as 
PPAR-γ2. 

Expression of human, mammalian, bacterial, fungal, protozoal, Archaeal, plant 
and viral genes can be modulated; viral genes include, but are not limited to, hepatitis 
virus genes such as, for example, HBV-C, HBV-S, HBV-X and HBV-P; and HIV 
genes such as, for example, tat and rev. Modulation of expression of genes encoding 
antigens of a pathogenic organism can be achieved using the disclosed methods and 
compositions. 

Additional genes include those encoding cytokines, lymphokines, interleukins, 
growth factors, mitogenic factors, apoptotic factors, cytochromes, chemotactic factors, 
chemokine receptors (e.g., CCR-2, CCR-3, CCR-5, CXCR-4), phospholipases (e.g., 
phospholipase C), nuclear receptors, retinoid receptors, organellar receptors, hormones, 
hormone receptors, oncogenes, tumor suppressors, cyclins, cell cycle checkpoint proteins 
(e.g., Chk1, Chk2), senescence-associated genes, immunoglobulins, genes encoding heavy 
metal chelators, protein tyrosine kinases, protein tyrosine phosphatases, tumor necrosis 
factor receptor-associated factors (e.g., Traf-3, Traf-6), apolipoproteins, thrombic factors, 
vasoactive factors, neuroreceptors, cell surface receptors, G-proteins, G-protein-coupled 
receptors (e.g., substance K receptor, angiotensin receptor, α- and β-adrenergic receptors, 
serotonin receptors, and PAF receptor), muscarinic receptors, acetylcholine receptors, 
GABA receptors, glutamate receptors, dopamine receptors, adhesion proteins (e.g., CAMs, 
selectins, integrins and immunoglobulin superfamily members), ion channels, 
receptor-associated factors, hematopoietic factors, transcription factors, and molecules involved in 
signal transduction. Expression of disease-related genes, and/or of one or more genes 
specific to a particular tissue or cell type such as, for example, brain, muscle, heart, 
nervous system, circulatory system, reproductive system, genitourinary system, digestive 
system and respiratory system can also be modulated. 

Other applications include therapeutic methods in which a modified ZFP, a ZFP 
fusion polypeptide, or a nucleic acid encoding a modified ZFP or a ZFP fusion is 
administered to a subject and used to modulate the expression of a target gene within the 
subject (as disclosed, for example, in co-owned PCT WO 00/41566). The modulation can 
be in the form of repression, for example, when the target gene resides in a pathological 
infecting microorganism, or in an endogenous gene of the patient, such as an oncogene or 
viral receptor, that is contributing to a disease state. Alternatively, the modulation can be 
in the form of activation, when activation of expression or increased expression of an 
endogenous cellular gene (such as, for example, a tumor suppressor gene) can ameliorate a 
disease state. Exemplary ZFP fusion polypeptides for both activation and repression of 
gene expression are disclosed supra. For such applications, modified ZFPs, ZFP fusion 
polypeptides or, more typically, nucleic acids encoding them are formulated with a 
pharmaceutically acceptable carrier as a pharmaceutical composition. 

Pharmaceutically acceptable carriers and excipients are determined in part by the 
particular composition being administered, as well as by the particular method used to 
administer the composition. See, for example, Remington's Pharmaceutical Sciences, 17th 
ed., 1985. ZFPs, ZFP fusion polypeptides, or polynucleotides encoding ZFP fusion 
polypeptides, alone or in combination with other suitable components, can be made into 
aerosol formulations (i.e., they can be "nebulized") to be administered via inhalation. 
Aerosol formulations can be placed into pressurized acceptable propellants, such as 
dichlorodifluoromethane, propane, nitrogen, and the like. Formulations suitable for 
parenteral administration, such as, for example, by intravenous, intramuscular, 
intradermal, and subcutaneous routes, include aqueous and non-aqueous, isotonic sterile 
injection solutions, which can contain antioxidants, buffers, bacteriostats, and solutes that 
render the formulation isotonic with the blood of the intended recipient, and aqueous and 
non-aqueous sterile suspensions that can include suspending agents, solubilizers, 
thickening agents, stabilizers, and preservatives. Compositions can be administered, for 
example, by intravenous infusion, orally, topically, intraperitoneally, intravesically or 
intrathecally. The formulations of compounds can be presented in unit-dose or multi-dose 
sealed containers, such as ampoules and vials. Injection solutions and suspensions can be 
prepared from sterile powders, granules, and tablets of the kind previously described. 

The dose administered to a patient should be sufficient to effect a beneficial 
therapeutic response in the patient over time. The dose is determined by the efficacy and 
binding affinity (Kd) of the particular ZFP employed, the target cell, and the condition of 
the patient, as well as the body weight or surface area of the patient to be treated. The size 
of the dose also is determined by the existence, nature, and extent of any adverse side 
effects that accompany the administration of a particular compound or vector in a 
particular patient. 

In other applications, modified ZFPs and other DNA- and/or RNA-binding proteins 
are used in diagnostic methods for sequence-specific detection of target nucleic acid in a 
sample. For example, modified ZFPs can be used to detect variant alleles associated with 
a disease or phenotype in patient samples. As an example, modified ZFPs can be used to 
detect the presence of particular mRNA species or cDNA in a complex mixture of mRNAs 
or cDNAs. As a further example, modified ZFPs can be used to quantify the copy number 
of a gene in a sample. For example, detection of loss of one copy of a p53 gene in a 
clinical sample is an indicator of susceptibility to cancer. In a further example, modified 
ZFPs are used to detect the presence of pathological microorganisms in clinical samples. 
This is achieved by using one or more modified ZFPs, as disclosed herein, that bind a 
target sequence in one or more genes within the microorganism to be detected. A suitable 
format for performing diagnostic assays employs modified ZFPs linked to a domain that 
allows immobilization of the ZFP on a solid support such as, for example, a microtiter 
plate or an ELISA plate. The immobilized ZFP is contacted with a sample suspected of 
containing a target nucleic acid under conditions in which binding between the modified 
ZFP and its target sequence can occur. Typically, nucleic acids in the sample are labeled 
(e.g., in the course of PCR amplification). Alternatively, unlabelled nucleic acids can be 
detected using a second labeled probe nucleic acid. After washing, bound, labeled nucleic 
acids are detected. Labeling can be direct (i.e., the probe binds directly to the target 
nucleic acid) or indirect (i.e., probe binds to one or more molecules which themselves bind 
to the target). Labels can be, for example, radioactive, fluorescent, chemiluminescent 
and/or enzymatic. 

Modified ZFPs, as disclosed herein, can also be used in assays that link phenotype 
to the expression of particular genes. Current methodologies for determination of gene 
function rely primarily upon either over-expressing a gene of interest or removing a gene 
of interest from its natural biological setting, and observing the effects. The phenotypic 
effects resulting from over-expression or knockout are then interpreted as an indication of 
the role of the gene in the biological system. An exemplary animal model system for 
performing these types of analysis is the mouse. A transgenic mouse generally contains an 
introduced gene or has been genetically modified so as to up-regulate an endogenous gene. 
Alternatively, in a "knock-out" mouse, an endogenous gene has been deleted or its 
expression has been ablated. There are several problems with these existing systems, 
many of which are related to the fact that it is only possible to achieve "all-or-none" 
modulation of gene expression in these systems. The first is the limited ability to 
modulate expression of the gene under study (e.g., in knock-out mice, the gene under 
study is generally either absent from the genome or totally non-functional; while in 
transgenic mice which overexpress a particular gene, there is generally a single level of 
overexpression). The second is the oft-encountered requirement for certain genes at 
multiple stages of development. Thus, it is not possible to determine the adult function of 
a particular gene, whose activity is also required during embryonic development, by 
generating a knock-out of that gene, since the animals containing the knock-out will not 
survive to adulthood. 

One advantage of using ZFP-mediated regulation of a gene to determine its 
function, relative to the aforementioned conventional knockout analysis, is that expression 
of a ZFP can be placed under small molecule control. See, for example, U.S. Patent 
Nos. 5,654,168; 5,789,156; 5,814,618; 5,888,981; 6,004,941; 6,087,166; 6,136,954; 
and co-owned WO 00/41566. By controlling expression levels of the ZFPs, one can in 
turn control the expression levels of a gene regulated by the ZFP to determine what degree 
of repression or stimulation of expression is required to achieve a given phenotypic or 
biochemical effect. This approach has particular value for drug development. In addition, 
placing ZFP expression under small molecule control allows one to surmount the 
aforementioned problems of embryonic lethality and developmental compensation, by 
switching on expression of the ZFP at a later stage in development and observing the 
effects in the adult animal. 

Transgenic mice having target genes regulated by a modified ZFP or a ZFP fusion 
protein can be produced by integration of the nucleic acid encoding the modified ZFP or 
ZFP fusion at any site in trans to the target gene. Accordingly, homologous 
recombination is not required for integration of the ZFP-encoding nucleic acid. Further, 
because the transcriptional regulatory activity of a modified ZFP or ZFP fusion is 
trans-dominant, one is only required to obtain animals having one chromosomal copy of a 
ZFP-encoding nucleic acid. Therefore, functional knock-out animals can be produced without 
backcrossing. 

The following examples are presented as illustrative of, but not limiting, the 
claimed subject matter. 

EXAMPLES 

Example 1. Production of non-canonical zinc finger binding proteins 

Synthetic genes encoding non-canonical zinc finger binding proteins are obtained 
following the procedure outlined in co-owned PCT WO 00/42219, with the exception that 
the oligonucleotide encoding the recognition helix to be modified includes a 
polynucleotide sequence that specifies the modified amino acid sequence. For example, 
for modification of finger 3 (the C-terminal-most finger of a three-finger ZFP), the 
sequence of oligonucleotide 6 is designed to encode the modified zinc coordination 
residue(s). 

Example 2. Modulation of expression of the LCK gene with Non-Canonical ZFP 

In this experiment, the designed zinc finger protein "PTP2", which recognizes the 
target sequence GAGGGGGCG and regulates expression of the LCK gene, was modified 
via substitution of the 2nd histidine in its third finger with cysteine (to yield the protein 
"PTP2(H->C)"). Two flanking residues were also changed to glycine to enhance the 
potential of the introduced cysteine to productively coordinate zinc. The sequences of the 
resultant zinc finger proteins were as follows: 

PTP2: 

F1 PGKKKQHICHIQGCGKVYGRSDELTRHLRWHTGER (SEQ ID NO:112) 
F2 PFMCTWSYCGKRFTRSDHLTRHKRTHTGEK (SEQ ID NO:113) 
F3 KFACPE--CPKRFMRSDNLTRHIKTHQNKKGGS (SEQ ID NO:114) 

PTP2(H->C): 

F1 PGKKKQHICHIQGCGKVYGRSDELTRHLRWHTGER (SEQ ID NO:115) 
F2 PFMCTWSYCGKRFTRSDHLTRHKRTHTGEK (SEQ ID NO:116) 
F3 KFACPE--CPKRFMRSDNLTRHIGGCQNKKGGS (SEQ ID NO:117) 

Bold and underlines highlight zinc-coordinating residues, and italics highlights 
positions changed in converting PTP2 into PTP2 (H -> C). 

Both ZFPs were expressed in 293 cells as fusions with a nuclear localization signal 
(NLS), VP16 activation domain, and a FLAG tag. The structure (i.e., order) of the fusion 
proteins was as follows: 

NLS | ZFP | VP16 | FLAG 

After expression of each protein in 293 cells, cellular levels of the LCK mRNA 
were determined relative to the level of a control RNA (18S RNA) using a PCR based 
"Taqman" assay. RNA levels were also determined for a control protein (NVF) lacking 
any ZFP (and containing only the NLS, VP16 and FLAG regions). Each experiment was 
performed in duplicate, and the measured RNA ratios are shown in Figure 1. These ratios 
indicate that the PTP2 ZFP activates expression of the LCK gene, and that the PTP2(H->C) 
ZFP activates LCK to even higher levels. These results illustrate the potential of 
substitutions at zinc-coordinating positions to provide ZFPs with enhanced cellular 
function. As illustrated in Figure 1, modification of zinc-coordinating positions can 
enhance the cellular activity of designed zinc finger protein transcription factors. 
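
The normalization described in this example can be sketched as follows. This is a minimal illustration only, assuming that each duplicate measurement yields one LCK/18S RNA ratio and that activation is expressed relative to the ZFP-less NVF control; the function name and all numeric values are hypothetical placeholders and are not the data of Figure 1.

```python
from statistics import mean

def fold_activation(sample_ratios, control_ratios):
    """Mean LCK/18S ratio of a sample divided by the mean ratio of the control."""
    return mean(sample_ratios) / mean(control_ratios)

# Hypothetical duplicate LCK/18S ratios (placeholders, not measured values).
nvf_control = [1.0, 1.0]   # NLS-VP16-FLAG construct lacking any ZFP
zfp_sample = [2.0, 4.0]    # a ZFP-VP16 fusion

print(fold_activation(zfp_sample, nvf_control))  # 3.0
```

Dividing by the NVF control separates the effect of the ZFP DNA-binding domain from any effect of the NLS, VP16 and FLAG regions alone.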
Example 3. Modulation of expression of a human VEGF gene with modified ZFPs 

This example describes the modification of two VEGF-regulating ZFPs. For each 
of the two ZFPs, a number of non-canonical modified ZFPs were constructed. The 
proteins were then tested for their ability to regulate VEGF expression and compared with 
the two C2H2 parental proteins. 

Zinc finger proteins comprising a series of C2H2 zinc fingers, and designed to bind 
to the human VEGF-A gene and regulate its expression, have been described. Liu et al. 
(2001) J. Biol. Chem. 276:11,323-11,334. Two of these ZFPs (named VOP30A and 
VOP32B), each containing three zinc fingers, were converted to non-canonical ZFPs. 
VOP30A corresponds to VZ+42/+530 and VOP32B corresponds to VZ+434a in the Liu et 
al. reference. This was accomplished by modifying the third finger of each protein. Seven 
non-canonical versions of each protein were made, each comprising a different 
non-canonical C2HC third finger. Amino acid sequences of portions of the canonical parent 
ZFPs and each of the non-canonical ZFPs, beginning at histidine +7 (with respect to the 
start of the alpha-helix) of the third finger, are shown in Table 1. 



Table 1 

NAME    SEQUENCE        SEQ ID NO 
C2H2    HIKTHQNKKGGS    11 
S       HSETGCTKKGGS    12 
E       HLKSLTPCTGGS    13 
K       HKCGIQNKKGGS    14 
CT      HSENCQGKKGGS    15 
C       HIKTCQNKKGGS    16 
GC      HIKGCQNKKGGS    17 
GGC     HIGGCQNKKGGS    18 

1. sequences begin at +7 of the alpha helix of the third zinc finger 
2. residues involved in metal coordination are bolded and underlined 
3. the first row (protein designated C2H2) shows the sequence of the parental ZFPs 

Human embryonic kidney cells (HEK 293) were transfected with nucleic acids 
encoding non-canonical derivatives of the VOP30A and VOP32B fusion proteins, as well 
as the parent (canonical) fusion proteins. The fusion proteins also comprised a VP16 
transcriptional activation domain, a nuclear localization sequence and an epitope tag. 

The cells were grown in DMEM (Dulbecco's modified Eagle's medium), 
supplemented with 10% fetal bovine serum, in a 5% CO2 incubator at 37°C. Cells were 
plated in 24-well plates at a density of 160,000 cells per well. A day later, when the cells 
were at approximately 70% confluence, plasmids encoding ZFP-VP16 fusions were 
introduced into the cells using LipofectAMINE 2000™ reagent (Gibco Life Technologies, 
Rockville, MD) according to the manufacturer's recommendations, using 2 µl 
LipofectAMINE 2000™ and 1 µg plasmid DNA per well. Medium was removed and 
replaced with fresh medium 16 hours after transfection. Forty hours after transfection, the 
culture medium was harvested and assayed for VEGF-A expression. VEGF-A protein 
content in the culture medium was assayed using a human VEGF ELISA kit (Quanti-Glo, 
R&D Systems, Minneapolis, MN) according to the manufacturer's instructions. 

The results, shown in Figure 2, indicate that C2HC derivatives of both VOP30A 
and VOP32B activate VEGF expression and are thus useful as targeted exogenous 
regulatory molecules. 

Example 4. Production of modified plant zinc finger binding proteins 

This example describes a strategy to select amino acid sequences for plant zinc 
finger backbones from among existing plant zinc finger sequences, and subsequent 
conceptual modification of the selected plant zinc finger amino acid sequences to optimize 
their DNA binding ability. Oligonucleotides used in the preparation of polynucleotides 
encoding proteins containing these zinc fingers in tandem array are then described. 

A. Selection of plant zinc finger backbones 

A search was conducted for plant zinc fingers whose backbone sequences (i.e., the 
portion of the zinc finger outside of the -1 through +6 portion of the recognition helix) 
resembled that of the SP-1 consensus sequence described by Berg (1992) Proc. Natl. Acad. 
Sci. USA 89:11,109-11,110. The sequences selected included the two conserved cysteine 
residues, a conserved basic residue (lysine or arginine) located two residues to the 
C-terminal side of the second (i.e., C-terminal) cysteine, a conserved phenylalanine residue 
located two residues to the C-terminal side of the basic residue, the two conserved 
histidine residues, and a conserved arginine residue located two residues to the C-terminal 
side of the first (i.e., N-terminal) conserved histidine. The amino acid sequences of these 
selected plant zinc finger backbones (compared to the SP-1 consensus sequence) are 
shown below, with conserved residues shown in bold and X referring to residues located at 
positions -1 through +6 in the recognition helix (which will differ among different 
proteins depending upon the target sequence): 

SP-1 consensus:       YKCPECGKSFSXXXXXXXHQRTHTGEKP  (SEQ ID NO:19) 
F1:             KKKSKGHECPICFRVFKXXXXXXXHKRSHTGEKP  (SEQ ID NO:20) 
F2:                   YKCTVCGKSFSXXXXXXXHKRLHTGEKP  (SEQ ID NO:21) 
F3:                   FSCNYCQRKFYXXXXXXXHVRIH       (SEQ ID NO:22) 
                            -5  -1    5 

The first finger (F1) was chosen because it contained a basic sequence N-terminal to 
the finger that is also found adjacent to the first finger of SP-1. The finger denoted F1 is a 
Petunia sequence, the F2 and F3 fingers are Arabidopsis sequences. 
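
The selection criteria listed above can be expressed as a simple pattern match. The following is a minimal sketch, not the search actually performed: the regular expression and function name are illustrative assumptions, X stands for the seven helix positions (-1 through +6), and allowing two to four residues between the cysteines is an assumption (the fingers shown here all have two).

```python
import re

# Conserved positions: C...C, a basic residue two positions after the second C,
# F two positions after the basic residue, one residue at position -2,
# seven helix residues (X), then H, R two positions after it, and the second H.
BACKBONE = re.compile(r"C.{2,4}C.[KR].F.X{7}H.R.H")

def has_sp1_like_backbone(seq: str) -> bool:
    """Return True if seq contains the conserved backbone positions."""
    return BACKBONE.search(seq) is not None

print(has_sp1_like_backbone("YKCPECGKSFSXXXXXXXHQRTHTGEKP"))  # SP-1 consensus: True
print(has_sp1_like_backbone("FSCNYCQRKFYXXXXXXXHVRIH"))       # F3: True
print(has_sp1_like_backbone("YKCPECGKSASXXXXXXXHQRTHTGEKP"))  # F replaced by A: False
```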

B. Modification of plant zinc finger backbones 

Two of the three plant zinc fingers (F1 and F3, above) were modified so that their 
amino acid sequences more closely resembled the sequence of SP-1, as follows. (Note that 
the sequence of SP-1 is different from the sequence denoted "SP-1 consensus.") In F3, the Y 
residue at position -2 was converted to a G, and the sequence QNKK (SEQ ID NO:23) was 
added to the C-terminus of F3. The QNKK (SEQ ID NO:23) sequence is present C-terminal 
to the third finger of SP-1, and permits greater flexibility of that finger, compared to fingers 1 
and 2, which are flanked by the helix-capping sequence T G E K/R K/P (SEQ ID NO:24). 
Such flexibility can be particularly beneficial when the third finger is modified to contain a 
non-C2H2 backbone, as described herein. Finally, several amino acids were removed from 
the N-terminus of F1. The resulting zinc finger backbones had the following sequences: 

KSKGHECPICFRVFKXXXXXXXHKRSHTGEKP (SEQ ID NO:25) 
YKCTVCGKSFSXXXXXXXHKRLHTGEKP (SEQ ID NO:26) 
FSCNYCQRKFGXXXXXXXHVRIHQNKK (SEQ ID NO:27) 

Amino acid residues denoted by X, present in the recognition portion of these zinc 
fingers, are designed or selected depending upon the desired target site, according to methods 
disclosed, for example, in co-owned WO 00/41566 and WO 00/42219, and/or references 
cited supra. 
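
The F3 modification and the subsequent placement of a recognition helix into a backbone can be sketched as simple string operations. This is an illustration only; the function names are hypothetical, and the helix RSDNLTR is taken from Table 2 of Example 5.

```python
def modify_f3(backbone: str) -> str:
    """Convert the Y at position -2 to G and append the SP-1 QNKK tail."""
    first_helix = backbone.index("X")        # position -1 of the recognition helix
    if backbone[first_helix - 1] != "Y":     # position -2 immediately precedes it
        raise ValueError("expected Y at position -2")
    return backbone[: first_helix - 1] + "G" + backbone[first_helix:] + "QNKK"

def insert_helix(backbone: str, helix: str) -> str:
    """Replace the seven X placeholders (-1 through +6) with a designed helix."""
    if len(helix) != 7 or "X" * 7 not in backbone:
        raise ValueError("helix must have 7 residues and backbone 7 X placeholders")
    return backbone.replace("X" * 7, helix)

f3_modified = modify_f3("FSCNYCQRKFYXXXXXXXHVRIH")
print(f3_modified)                           # FSCNYCQRKFGXXXXXXXHVRIHQNKK
print(insert_helix(f3_modified, "RSDNLTR"))  # FSCNYCQRKFGRSDNLTRHVRIHQNKK
```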


C. Nucleic acid sequences encoding backbones for modified plant ZFPs 

The following polynucleotide sequences were used for design of three-finger plant 
ZFPs that contain the F1, F2 and F3 backbones described above. Polynucleotides encoding 
multi-finger ZFPs were designed according to an overlapping oligonucleotide method as 
described in, for example, co-owned WO 00/41566 and WO 00/42219. Oligonucleotides H1, 
H2 and H3 (below) comprise sequences corresponding to the reverse complement of the 
recognition helices of fingers 1-3 respectively; accordingly, nucleotides denoted by N vary 
depending upon the desired amino acid sequences of the recognition helices, which, in turn, 
depend upon the nucleotide sequence of the target site. Oligonucleotides PB1, PB2 and PB3 
encode the beta-sheet portions of the zinc fingers, which are common to all constructs. 
Codons used frequently in Arabidopsis and E. coli were selected for use in these 
oligonucleotides. 



H1: 
5'-CTC ACC GGT GTG AGA ACG CTT GTG NNN NNN NNN NNN NNN NNN NNN 
CTT GAA AAC ACG GAA-3' 
(SEQ ID NO:28) 

H2: 
5'-TTC ACC AGT ATG AAG ACG CTT ATG NNN NNN NNN NNN NNN NNN NNN 
AGA AAA AGA CTT ACC-3' 
(SEQ ID NO:29) 

H3: 
5'-CTT CTT GTT CTG GTG GAT ACG CAC GTG NNN NNN NNN NNN NNN NNN 
NNN ACC GAA CTT ACG CTG-3' 
(SEQ ID NO:30) 

PB1: 
5'-AAGTCTAAGGGTCACGAGTGCCCAATCTGCTTCCGTGTTTTCAAG-3' 
(SEQ ID NO:31) 

PB2: 
5'-TCTCACACCGGTGAGAAGCCATACAAGTGCACTGTTTGTGGTAAGTCTTTTTCT-3' 
(SEQ ID NO:32) 

PB3: 
5'-CTTCATACTGGTGAAAAGCCATTCTCTTGCAACTACTGCCAGCGTAAGTTCGGT-3' 
(SEQ ID NO:33) 


Briefly, these six oligonucleotides are annealed and amplified by polymerase chain 
reaction. The initial amplification product is reamplified using primers that are 
complementary to the initial amplification product and that also contain 5' extensions 
containing restriction enzyme recognition sites, to facilitate cloning. The second 
amplification product is inserted into a vector containing, for example, one or more 
functional domains, nuclear localization sequences, and/or epitope tags. See, for example, 
co-owned WO 00/41566 and WO 00/42219. 
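
Because the H oligonucleotides carry the reverse complement of the helix-coding sequence, the N positions of a template can be filled as sketched below. This is an illustration under stated assumptions: the function names are hypothetical, the template is H1 with its spaces removed, and the helix coding sequence (for RSDELTR) uses the codons that appear in the F1 oligonucleotide of Table 2 in Example 5.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

# H1 template: 21 contiguous N positions cover the 7 helix codons.
H1_TEMPLATE = "CTCACCGGTGTGAGAACGCTTGTG" + "N" * 21 + "CTTGAAAACACGGAA"

def fill_template(template: str, helix_coding: str) -> str:
    """Replace the N run with the reverse complement of the helix-coding DNA."""
    n_count = template.count("N")
    if len(helix_coding) != n_count:
        raise ValueError("helix coding sequence must match the number of N positions")
    return template.replace("N" * n_count, reverse_complement(helix_coding))

# Coding strand for R-S-D-E-L-T-R: CGT TCT GAC GAG TTG ACC CGT
print(fill_template(H1_TEMPLATE, "CGTTCTGACGAGTTGACCCGT"))
```

The printed sequence can be compared directly with the F1 entry of Table 2.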



Example 5. Construction of a polynucleotide encoding a modified plant zinc 
finger protein for binding to a predetermined target sequence 

A modified plant zinc finger protein was designed to recognize the target sequence 
5'-GAGGGGGCG-3'. Recognition helix sequences for F1, F2 and F3 were determined, as 
shown in Table 2, and oligonucleotides corresponding to H1, H2 and H3 above, also 
including sequences encoding these recognition helices, were used for PCR assembly as 
described above. 

Table 2 

Finger  Target  Helix sequence            Nucleotide sequence for PCR assembly 
F1      GCG     RSDELTR (SEQ ID NO:109)   5'-CTCACCGGTGTGAGAACGCTTGTGACGGGTCAACTCGTCAGAACGCTTGAAAACACGGAA-3' (SEQ ID NO:34) 
F2      GGG     RSDHLTR (SEQ ID NO:110)   5'-TTCACCAGTATGAAGACGCTTATGACGGGTCAAGTGGTCAGAACGAGAAAAAGACTTACC-3' (SEQ ID NO:35) 
F3      GAG     RSDNLTR (SEQ ID NO:111)   5'-CTTCTTGTTCTGGTGGATACGCACGTGACGGGTCAAGTTGTCAGAACGACCGAACTTACGCTG-3' (SEQ ID NO:36) 



Subsequent to the initial amplification, a secondary amplification was conducted, as 
described above, using the following primers: 

PZF: 5'-CGGGGTACCAGGTAAGTCTAAGGGTCAC (SEQ ID NO:37) 

PZR: 5'-GCGCGGATCGACCCTTCTTGTTCTGGTGGATACG (SEQ ID NO:38). 

PZF includes a KpnI site (underlined) and overlaps the PB1 sequence (overlap 
indicated in bold). PZR includes a BamHI (underlined) site and overlaps with H3 (indicated 
in bold). 

The secondary amplification product is digested with KpnI and BamHI and inserted 
into an appropriate vector (e.g., YCF3, whose construction is described below) to construct 
an expression vector encoding a modified plant ZFP fused to a functional domain, for 
modulation of gene expression in plant cells. 
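
As a consistency check, each assembly oligonucleotide of Table 2 can be reverse-complemented and translated to confirm that it encodes the stated recognition helix flanked by backbone residues. The following is a minimal sketch using the standard genetic code; the helper names are illustrative, not part of the disclosed method.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def reverse_complement(dna: str) -> str:
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna: str) -> str:
    """Translate complete codons in frame 1; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

TABLE2 = {  # finger: (helix, assembly oligonucleotide)
    "F1": ("RSDELTR", "CTCACCGGTGTGAGAACGCTTGTGACGGGTCAACTCGTCAGAACGCTTGAAAACACGGAA"),
    "F2": ("RSDHLTR", "TTCACCAGTATGAAGACGCTTATGACGGGTCAAGTGGTCAGAACGAGAAAAAGACTTACC"),
    "F3": ("RSDNLTR", "CTTCTTGTTCTGGTGGATACGCACGTGACGGGTCAAGTTGTCAGAACGACCGAACTTACGCTG"),
}

for finger, (helix, oligo) in TABLE2.items():
    peptide = translate(reverse_complement(oligo))
    print(finger, peptide, helix in peptide)
```

For F3 the translated peptide is QRKFGRSDNLTRHVRIHQNKK, which matches the modified F3 backbone of Example 4 with the helix inserted.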






Example 6. Construction of Vectors for Expression of Modified Plant ZFPs 
YCF3 was generated as shown in Figure 3. The starting construct was a plasmid 
containing a CMV promoter, a SV40 nuclear localization sequence (NLS), a ZFP DNA 
binding domain, a Herpesvirus VP16 transcriptional activation domain and a FLAG 
epitope tag (pSB5186-NVF). This construct was digested with SpeI to remove the CMV 
promoter. The larger fragment was gel-purified and self-ligated to make a plasmid termed 
GF1. GF1 was then digested with KpnI and HindIII, releasing sequences encoding the 
ZFP domain, the VP16 activation domain, and the FLAG epitope tag, then the larger 
fragment was ligated to a KpnI/HindIII fragment containing sequences encoding a ZFP 
binding domain and a VP16 activation domain, to generate a plasmid named GF2. This 
resulted in deletion of sequences encoding the FLAG tag from the construct. 

GF2 was digested with BamHI and HindIII, releasing a small fragment encoding 
the VP16 activation domain, and the larger fragment was purified and ligated to a 
BamHI/HindIII digested PCR fragment containing the maize C1 activation domain (Goff 
et al. (1990) EMBO J. 9:2517-2522) (KpnI and HindIII sites were introduced into the PCR 
fragment through KpnI and HindIII site-containing primers) to generate NCF1. A PCR 
fragment containing a Maize Opaque-2 NLS was digested with SpeI/KpnI and ligated to 
the larger fragment from KpnI/SpeI digested NCF1 to produce YCF2. YCF2 was then 
digested with MluI and SpeI and the larger fragment was ligated to an MluI and SpeI 
digested PCR fragment containing the plant-derived CaMV 35S promoter (MluI and SpeI 
sites were introduced into the PCR fragment through MluI or SpeI site containing primers) 
to generate the YCF3 vector. 

Sequences encoding modified plant ZFP binding domains can be inserted, as
KpnI/BamHI fragments, into KpnI/BamHI-digested YCF3 to generate constructs encoding
ZFP-functional domain fusion proteins for modulation of gene expression in plant cells.
For example, a series of modified plant ZFP domains, described in Example 5 infra, were
inserted into KpnI/BamHI-digested YCF3 to generate expression vectors encoding
modified plant ZFP-activation domain fusion polypeptides that enhance expression of the
Arabidopsis thaliana GMT gene.
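
The succession of constructs in this Example can be followed more easily if each plasmid is treated as an ordered list of functional modules. The sketch below is illustrative only. The module order, the helper names (remove_module, replace_module) and the reading that the Opaque-2 NLS takes the place of the SV40 NLS are assumptions made for the illustration; they are not taken from Figure 3.

```python
# Illustrative bookkeeping only: each construct is an ordered list of modules.
# Module order and the NLS swap are assumptions, not details from Figure 3.

def remove_module(cassette, name):
    """Return a copy of the cassette without the named module."""
    if name not in cassette:
        raise ValueError(f"{name!r} not present in cassette")
    return [m for m in cassette if m != name]


def replace_module(cassette, old, new):
    """Return a copy of the cassette with module `old` swapped for `new`."""
    if old not in cassette:
        raise ValueError(f"{old!r} not present in cassette")
    return [new if m == old else m for m in cassette]


pSB5186_NVF = ["CMV promoter", "SV40 NLS", "ZFP", "VP16", "FLAG"]

GF1 = remove_module(pSB5186_NVF, "CMV promoter")          # SpeI digest, self-ligation
GF2 = remove_module(GF1, "FLAG")                           # KpnI/HindIII swap drops FLAG
NCF1 = replace_module(GF2, "VP16", "maize C1 AD")          # BamHI/HindIII swap
YCF2 = replace_module(NCF1, "SV40 NLS", "Opaque-2 NLS")    # SpeI/KpnI swap (assumed)
YCF3 = ["CaMV 35S promoter"] + YCF2                        # MluI/SpeI insertion

for name, cassette in [("GF1", GF1), ("GF2", GF2), ("NCF1", NCF1),
                       ("YCF2", YCF2), ("YCF3", YCF3)]:
    print(f"{name}: {' | '.join(cassette)}")
```

Under these assumptions YCF3 carries a plant promoter, a plant NLS, a ZFP slot and a plant activation domain, which is consistent with its use as a recipient for KpnI/BamHI ZFP fragments.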

Example 7. Modified ZFP Designs for Regulation of an Arabidopsis thaliana 
gamma tocopherol methyltransferase (GMT) Gene 

Modified zinc finger proteins were designed to recognize various target sequences
in the Arabidopsis GMT gene (GenBank Accession Number AAD38271). These proteins
were modified in two ways. First, they contained a plant backbone as described in
Example 4. Second, they contained a non-canonical (C2HC) third zinc finger in which the
second zinc-coordinating histidine of a canonical C2H2 structure was converted to a
cysteine. Table 3 shows the nucleotide sequences of the various GMT target sites, and the
amino acid sequences of zinc fingers that recognize the target sites. Sequences encoding 
these binding domains were prepared as described in Example 4 and inserted into YCF3 as 
described in Example 6.

Table 3

ZFP#  Target                      F1                        F2                        F3
1     GTGGACGAGT (SEQ ID NO:39)   RSDNLAR (SEQ ID NO:40)    DRSNLTR (SEQ ID NO:41)    RSDALTR (SEQ ID NO:42)
2     CGGGATGGGT (SEQ ID NO:43)   RSDHLAR (SEQ ID NO:44)    TSGNLVR (SEQ ID NO:45)    RSDHLRE (SEQ ID NO:46)
3     TGGTGGGTGT (SEQ ID NO:47)   RSDALTR (SEQ ID NO:48)    RSDHLTT (SEQ ID NO:49)    RSDHLTT (SEQ ID NO:50)
4     GAAGAGGATT (SEQ ID NO:51)   QSSNLAR (SEQ ID NO:52)    RSDNLAR (SEQ ID NO:53)    QSGNLTR (SEQ ID NO:54)
5     GAGGAAGGGG (SEQ ID NO:55)   RSDHLAR (SEQ ID NO:56)    QSGNLAR (SEQ ID NO:57)    RSDNLTR (SEQ ID NO:58)
6     TGGGTAGTC (SEQ ID NO:59)    ERGTLAR (SEQ ID NO:60)    QSGSLTR (SEQ ID NO:61)    RSDHLTT (SEQ ID NO:62)
7     GGGGAAAGGG (SEQ ID NO:63)   RSDHLTQ (SEQ ID NO:64)    QSGNLAR (SEQ ID NO:65)    RSDHLSR (SEQ ID NO:66)
8     GAAGAGGGTG (SEQ ID NO:67)   QSSHLAR (SEQ ID NO:68)    RSDNLAR (SEQ ID NO:69)    QSGNLAR (SEQ ID NO:70)
9     GAGGAGGATG (SEQ ID NO:71)   QSSNLQR (SEQ ID NO:72)    RSDNALR (SEQ ID NO:73)    RSDNLQR (SEQ ID NO:74)
10    GAGGAGGAGG (SEQ ID NO:75)   RSDNALR (SEQ ID NO:76)    RSDNLAR (SEQ ID NO:77)    RSDNLTR (SEQ ID NO:78)
11    GTGGCGGCTG (SEQ ID NO:79)   QSSDLRR (SEQ ID NO:80)    RSDELQR (SEQ ID NO:81)    RSDALTR (SEQ ID NO:82)
12    TGGGGAGAT (SEQ ID NO:83)    QSSNLAR (SEQ ID NO:84)    QSGHLQR (SEQ ID NO:85)    RSDHLTT (SEQ ID NO:86)
13    GAGGAAGCT (SEQ ID NO:87)    QSSDLRR (SEQ ID NO:88)    QSGNLAR (SEQ ID NO:89)    RSDNLTR (SEQ ID NO:90)
14    GCTTGTGGCT (SEQ ID NO:91)   DRSHLTR (SEQ ID NO:92)    TSGHLTT (SEQ ID NO:93)    QSSDLTR (SEQ ID NO:94)
15    GTAGTGGATG (SEQ ID NO:95)   QSSNLAR (SEQ ID NO:96)    RSDALSR (SEQ ID NO:97)    QSGSLTR (SEQ ID NO:98)
16    GTGTGGGATT (SEQ ID NO:99)   QSSNLAR (SEQ ID NO:100)   RSDHLTT (SEQ ID NO:101)   RSDALTR (SEQ ID NO:102)
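
To read a row of Table 3, it can help to pair each finger with a three-nucleotide subsite of its target. The sketch below does this for three rows copied from the table. It assumes the convention, common for C2H2-type designs but not stated in this section, that F1 contacts the 3'-most triplet and F3 the 5'-most triplet of the first nine nucleotides. Any tenth nucleotide is ignored, and the function name is hypothetical.

```python
# Three rows copied from Table 3: ZFP# -> (target 5'->3', (F1, F2, F3)).
TABLE3 = {
    1: ("GTGGACGAGT", ("RSDNLAR", "DRSNLTR", "RSDALTR")),
    3: ("TGGTGGGTGT", ("RSDALTR", "RSDHLTT", "RSDHLTT")),
    10: ("GAGGAGGAGG", ("RSDNALR", "RSDNLAR", "RSDNLTR")),
}


def pair_fingers_with_triplets(target, fingers):
    """Pair F1..F3 with 3-nt subsites of the first nine target nucleotides.

    Assumption (not from the source text): F1 reads the 3'-most triplet
    and F3 reads the 5'-most triplet.
    """
    core = target[:9]
    triplets = [core[i:i + 3] for i in range(0, 9, 3)]  # listed 5' -> 3'
    return {
        f"F{n}": (helix, triplet)
        for n, helix, triplet in zip((1, 2, 3), fingers, reversed(triplets))
    }


for zfp, (target, fingers) in TABLE3.items():
    print(zfp, pair_fingers_with_triplets(target, fingers))
```

Under this assumed pairing, RSDALTR is matched with GTG both as F3 of ZFP 1 and as F1 of ZFP 3, and RSDHLTT is matched with TGG in both of its positions in ZFP 3.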



Example 8. Modulation of Expression of an Arabidopsis thaliana gamma
tocopherol methyltransferase (GMT) Gene

Arabidopsis thaliana protoplasts were prepared and transfected with plasmids
encoding modified ZFP-activation domain fusion polypeptides. Preparation of protoplasts
and polyethylene glycol-mediated transfection were performed as described (Abel et al.
(1994) Plant Journal 5:421-427). The different plasmids contained the modified plant ZFP
binding domains described in Table 3, inserted as KpnI/BamHI fragments into YCF3.

At 18 hours after transfection, RNA was isolated from transfected protoplasts
using an RNA extraction kit from Qiagen (Valencia, CA) according to the manufacturer's
instructions. The RNA was then treated with DNase (RNase-free) and analyzed for GMT
mRNA content by real-time PCR (TaqMan®). Table 4 shows the sequences of the primers
and probes used for TaqMan® analysis. Results for GMT mRNA levels were normalized to
levels of 18S rRNA. These normalized results are shown in Figure 4 as fold-activation of
GMT mRNA levels, compared to protoplasts transfected with carrier DNA (denoted "No
ZFP" in Figure 4). The results indicate that expression of the GMT gene was enhanced in
protoplasts that were transfected with plasmids encoding fusions between a transcriptional
activation domain and a modified plant ZFP binding domain targeted to the GMT gene.
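
The normalization described above amounts to a ratio of ratios: each GMT signal is divided by the 18S rRNA signal from the same sample, and the normalized value is then divided by that of the carrier-DNA control. A minimal sketch follows. The function names and all numerical values are hypothetical and are not data from Figure 4; the inputs are assumed to be relative quantities already derived from the real-time PCR readout.

```python
def normalized_level(gmt_signal, rrna_signal):
    """GMT mRNA level normalized to 18S rRNA from the same sample."""
    if rrna_signal <= 0:
        raise ValueError("18S rRNA signal must be positive")
    return gmt_signal / rrna_signal


def fold_activation(sample, control):
    """Fold-activation of a sample relative to the 'No ZFP' control.

    Both arguments are (gmt_signal, rrna_signal) tuples of relative quantities.
    """
    return normalized_level(*sample) / normalized_level(*control)


# Hypothetical relative quantities, chosen only to illustrate the arithmetic.
no_zfp = (1.0, 2.0)       # carrier DNA control
zfp_sample = (3.0, 1.5)   # protoplasts transfected with a ZFP fusion plasmid

print(fold_activation(zfp_sample, no_zfp))  # 4.0
```

By construction the control itself has a fold-activation of 1, so values above 1 correspond to enhanced GMT expression.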

Table 4

Oligonucleotide       SEQUENCE
GMT forward primer    5'-AATGATCTCGCGGCTGCT-3' (SEQ ID NO:103)
GMT reverse primer    5'-GAATGGCTGATCCAACGCAT-3' (SEQ ID NO:104)
GMT probe             5'-TCACTCGCTCATAAGGCTTCCTTCCAAGT-3' (SEQ ID NO:105)
18S forward primer    5'-TGCAACAAACCCCGACTTATG-3' (SEQ ID NO:106)
18S reverse primer    5'-CCCGCGTCGACCTTTTATC-3' (SEQ ID NO:107)
18S probe             5'-AATAAATGCGTCCCTT-3' (SEQ ID NO:108)
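
Basic properties of the oligonucleotides in Table 4 can be tabulated with a few lines of code. The sketch below reports the length and GC fraction of each sequence and provides a reverse-complement helper. The helper names are hypothetical, and no melting temperatures or other assay parameters are implied.

```python
# Sequences copied from Table 4 (written 5' to 3').
OLIGOS = {
    "GMT forward primer": "AATGATCTCGCGGCTGCT",
    "GMT reverse primer": "GAATGGCTGATCCAACGCAT",
    "GMT probe": "TCACTCGCTCATAAGGCTTCCTTCCAAGT",
    "18S forward primer": "TGCAACAAACCCCGACTTATG",
    "18S reverse primer": "CCCGCGTCGACCTTTTATC",
    "18S probe": "AATAAATGCGTCCCTT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C residues in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, returned 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


for name, seq in OLIGOS.items():
    print(f"{name}: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}")
```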



Although the foregoing methods and compositions have been described in detail
for purposes of clarity of understanding, certain modifications, as known to those of skill
in the art, can be practiced within the scope of the appended claims.

CLAIMS

What is claimed is:

1. An isolated, non-canonical zinc finger binding protein comprising one or
more non-canonical zinc finger components that bind to a target sequence.

2. The isolated zinc finger binding protein of claim 1, wherein the target 
sequence is a nucleic acid sequence. 






3. The isolated zinc finger binding protein of claim 1, wherein the target
sequence is an amino acid sequence.

4. The isolated zinc finger binding protein of claim 2, wherein the target 
sequence is DNA. 

5. The isolated zinc finger binding protein of claim 2, wherein the target 
sequence is RNA. 



6. The isolated zinc finger binding protein of any of claims 1 to 5, wherein
the amino acid sequence of one or more of the zinc finger components is selected from the
group consisting of:
X_3-B-X_{2-4}-Cys-X_{12}-His-X_{1-7}-His-X_4;
X_3-Cys-X_{2-4}-B-X_{12}-His-X_{1-7}-His-X_4;
X_3-Cys-X_{2-4}-Cys-X_{12}-Z-X_{1-7}-His-X_4;
X_3-Cys-X_{2-4}-Cys-X_{12}-His-X_{1-7}-Z-X_4;
X_3-B-X_{2-4}-B-X_{12}-His-X_{1-7}-His-X_4;
X_3-B-X_{2-4}-Cys-X_{12}-Z-X_{1-7}-His-X_4;
X_3-B-X_{2-4}-Cys-X_{12}-His-X_{1-7}-Z-X_4;
X_3-Cys-X_{2-4}-B-X_{12}-Z-X_{1-7}-His-X_4;
X_3-Cys-X_{2-4}-B-X_{12}-His-X_{1-7}-Z-X_4;
X_3-Cys-X_{2-4}-Cys-X_{12}-Z-X_{1-7}-Z-X_4;
X_3-Cys-X_{2-4}-B-X_{12}-Z-X_{1-7}-Z-X_4;
X_3-B-X_{2-4}-Cys-X_{12}-Z-X_{1-7}-Z-X_4;
X_3-B-X_{2-4}-B-X_{12}-His-X_{1-7}-Z-X_4;
X_3-B-X_{2-4}-B-X_{12}-Z-X_{1-7}-His-X_4; and
X_3-B-X_{2-4}-B-X_{12}-Z-X_{1-7}-Z-X_4,
wherein X is any amino acid, B is any amino acid except cysteine and Z
is any amino acid except histidine.

7. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-B-X_{2-4}-Cys-X_{12}-His-X_{1-7}-His-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

8. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-Cys-X_{2-4}-B-X_{12}-His-X_{1-7}-His-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

9. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-Cys-X_{2-4}-Cys-X_{12}-Z-X_{1-7}-His-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

10. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-Cys-X_{2-4}-Cys-X_{12}-His-X_{1-7}-Z-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

11. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-B-X_{2-4}-B-X_{12}-His-X_{1-7}-His-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

12. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-B-X_{2-4}-Cys-X_{12}-Z-X_{1-7}-His-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

13. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-B-X_{2-4}-Cys-X_{12}-His-X_{1-7}-Z-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

14. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-Cys-X_{2-4}-B-X_{12}-Z-X_{1-7}-His-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

15. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-Cys-X_{2-4}-B-X_{12}-His-X_{1-7}-Z-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

16. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-Cys-X_{2-4}-Cys-X_{12}-Z-X_{1-7}-Z-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

17. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-Cys-X_{2-4}-B-X_{12}-Z-X_{1-7}-Z-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

18. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-B-X_{2-4}-Cys-X_{12}-Z-X_{1-7}-Z-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

19. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-B-X_{2-4}-B-X_{12}-His-X_{1-7}-Z-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

20. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-B-X_{2-4}-B-X_{12}-Z-X_{1-7}-His-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

21. The isolated zinc finger binding protein of claim 6, wherein the zinc finger
component comprises the sequence X_3-B-X_{2-4}-B-X_{12}-Z-X_{1-7}-Z-X_4, wherein X is any
amino acid, B is any amino acid except cysteine and Z is any amino acid except histidine.

22. The isolated zinc finger binding protein of any of claims 1 to 21, wherein
the target sequence is in a plant cell.

23. The isolated zinc finger binding protein of any of claims 1 to 21, wherein
the target sequence is in an animal cell.






24. The isolated zinc finger binding protein of claim 23, wherein the target
sequence is in a human cell.

25. The isolated zinc finger binding protein of any of claims 1 to 24, wherein
the target sequence is a promoter sequence.

26. The isolated zinc finger binding protein of any of claims 1 to 25,
comprising three zinc finger components.

27. The isolated zinc finger binding protein of any of claims 1 to 26, wherein 
the target sequence comprises about 9 to about 14 contiguous base pairs. 

28. The isolated zinc finger binding protein of claim 26, wherein the third
finger component comprises a non-canonical zinc finger component.

29. The isolated zinc finger binding protein of any of claims 1 to 28, 
comprising a modified plant ZFP backbone. 


30. An isolated polynucleotide encoding a zinc-finger binding protein 
according to any of claims 1 to 29. 

31. An expression vector comprising the polynucleotide of claim 30. 


32. A host cell comprising the polynucleotide of claim 30. 

33. A fusion polypeptide comprising: (a) an isolated zinc finger binding protein
according to any of claims 1 to 29 and (b) at least one functional domain.


34. The fusion polypeptide of claim 33, wherein the functional domain is a 
repressive domain. 

35. The fusion polypeptide of claim 34, wherein the repressive domain is
selected from the group consisting of KRAB, MBD-2B, v-ErbA, MBD3, TR and members
of the DNMT family.

36. The fusion polypeptide of claim 35, wherein the functional domain is an
activation domain.

37. The fusion polypeptide of claim 36, wherein the activation domain is
selected from the group consisting of VP16, p65 subunit of NF-kappa B, and VP64.

38. The fusion polypeptide of claim 37, wherein the functional domain is
selected from the group consisting of an insulator domain, a chromatin-remodeling protein
or a methyl-binding domain.

39. An isolated polynucleotide encoding the fusion polypeptide of any of
claims 33 to 38.

40. An expression vector comprising the polynucleotide of claim 39. 

41. A host cell comprising the polynucleotide of claim 39. 


42. A method of modulating expression of a gene, the method comprising the 
step of contacting a region of DNA with a fusion molecule according to any of claims 33 
to 38. 

43. The method of claim 42, wherein the zinc finger binding protein of the
fusion molecule binds to a target site in a gene encoding a product selected from the group
consisting of vascular endothelial growth factor, erythropoietin, androgen receptor,
PPAR-γ2, p16, p53, pRb, dystrophin and E-cadherin.

44. The method of claim 42 or claim 43, wherein the gene is in a plant cell.

45. The method of claim 42 or claim 43, wherein the gene is in an animal cell. 

46. The method of claim 45, wherein the gene is in a human cell. 






47. A pharmaceutical composition comprising a non-canonical zinc finger
protein according to any one of claims 1 through 29 and 33 through 38 and a
pharmaceutically acceptable excipient.
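
The fifteen sequence patterns recited in claim 6 correspond to every combination of Cys or B at the two cysteine positions and His or Z at the two histidine positions, excluding the canonical Cys-Cys-His-His arrangement. The sketch below enumerates these combinations and converts each to a regular expression over one-letter amino acid codes. It reads X_n as exactly n residues and X_{m-n} as m to n residues, which is an interpretation of the claim notation. All function names and the test sequence are hypothetical.

```python
import itertools
import re

# Residue classes as defined in claim 6: B = any residue except Cys,
# Z = any residue except His. X (any residue) is written as "." below.
REGEX_FOR = {"Cys": "C", "B": "[^C]", "His": "H", "Z": "[^H]"}
CANONICAL = ("Cys", "Cys", "His", "His")


def noncanonical_combinations():
    """Yield each choice at the four coordinating positions except C2H2."""
    choices = [("Cys", "B"), ("Cys", "B"), ("His", "Z"), ("His", "Z")]
    for combo in itertools.product(*choices):
        if combo != CANONICAL:
            yield combo


def to_formula(combo):
    """Render a combination in the notation used in claim 6."""
    return "X_3-{}-X_{{2-4}}-{}-X_{{12}}-{}-X_{{1-7}}-{}-X_4".format(*combo)


def to_regex(combo):
    """Compile a combination to a regex over one-letter amino acid codes."""
    p1, p2, p3, p4 = (REGEX_FOR[c] for c in combo)
    return re.compile(f".{{3}}{p1}.{{2,4}}{p2}.{{12}}{p3}.{{1,7}}{p4}.{{4}}")


combos = list(noncanonical_combinations())
print(len(combos))  # 15

# Hypothetical C2HC finger: the second His of a C2H2 finger is replaced by Cys.
c2hc = "AAA" + "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "C" + "AAAA"
print(bool(to_regex(("Cys", "Cys", "His", "Z")).fullmatch(c2hc)))  # True
```

Because two of the spacers have variable length, a single sequence can satisfy more than one pattern, so a match of this kind is a screen rather than a unique classification.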



[Drawing sheets 1/5 to 5/5 of WO 02/057293 (PCT/US02/01893); graphics not reproduced. Recoverable axis labels: "LCK/18S" (sheet 1/5), "VEGF-A (pg/ml)" (sheet 2/5) and "fold of activation" (sheet 5/5).]